Abstract

This  research  uses  an  advanced  statistical  technique  to  expand  upon  the  current 
understanding  of  war  termination.  Specifically,  this  thesis  addressed  questions 
concerning  the  most  relevant  factors  toward  predicting  both  the  outcomes  of  interstate 
wars  and  the  winners  of  intrastate  and  extra-systemic  wars,  within  the  limitations  of  the 
available  data.  Open-source  war  data  from  the  Correlates  of  War  Project  was  analyzed 
using  both  binary  and  multinomial  logistic  regression  techniques.  While  the  Correlates 
of  War  Project  did  not  necessarily  focus  its  data  collection  efforts  on  those  variables 
historically  associated  with  war  termination,  it  did  provide  a  sufficient  number  of 
variables  with  which  to  demonstrate  the  applicability  of  logistic  regression  techniques  to 
war  termination  analyses.  As  a  consequence,  every  significant  logistic  regression  model 
contains  a  single  relevant  variable.  For  both  intrastate  and  extra-systemic  wars,  the 
duration  of  the  conflict  was  found  to  be  most  relevant  to  predicting  the  winner.  In 
contrast,  the  proportion  of  total  casualties  borne  by  a  nation  in  an  interstate  war  was  most 
relevant  to  predicting  the  manner  in  which  an  interstate  war  ends.  Conclusions  drawn 
from  this  research  and  suggestions  for  future  statistical  applications  to  war  termination 
studies  were  also  discussed. 
PATTERNS OF WAR TERMINATION: A STATISTICAL APPROACH


I.  Introduction 


Background 

What  must  be  done  to  convince  an  enemy  to  give  up  armed  resistance?  Most  of 
the  research  on  wars  has  been  devoted  to  the  prevention  of  war.  Much  less  focus  has 
been  placed  on  studying  the  factors  involved  in  terminating  a  war  after  it  ensues  (Pillar, 
1983:3). 


Problem  Statement 

Permeating the war termination literature is the lesson that deciding how a
war shall end is just as important as deciding how a war shall be fought (Ikle, 1991:1).
Additionally, ending a war such that the desired state of peace is achieved is equally
paramount. Knowledge must be gained concerning the appropriate amount of military
force required, not only to effect the cessation of hostilities, but also to contribute
positively to the planned peace (Ikle, 1991:x). Under the assumption that war is a
complex  and  unstable  phenomenon,  it  is  appropriate  to  examine  war  termination  through 
a  probabilistic  lens.  What  factors  are  relevant  to  ceasing  armed  hostilities?  To  what 
degree  are  such  factors  significant?  Can  these  factors  be  controlled  or  manipulated? 
Given specific values for such relevant factors, what is the likelihood of achieving one
outcome versus another? Logistic regression analyses on historical war data can address
these questions and provide objective insights into existing social science theories.

Numerous theories on war termination exist, and they have been used in political
and social science circles to explain the outcomes of past wars. However, beyond
elementary statistical measures, such as the proportion of wars since 1815 ending by a
negotiated settlement, there appears to be a lack of rigorous applications of advanced
statistical methods to describe how wars end. As a consequence, few of the social
science theories on war termination can be consistently applied, given similar wartime
conditions in multiple cases. Authors of these war termination studies suggest many
methods to devise a successful termination strategy, but few numerical methods have
been employed to either support or contradict their arguments.


Research  Objectives 

This thesis sought to identify the key factor or factors that contribute to the
termination of an armed conflict using readily available open-source data. The
overarching goal was to demonstrate the applicability of logistic regression techniques to
war termination analyses. Once the key variables were identified, the next phenomenon
to be addressed was how the contributory factors influence trends in both how wars end
and who wins wars. The three types of wars were analyzed separately to identify different
war termination patterns between war types. This study also sought to identify
developing trends between 19th and 20th Century warfare because the open-source data
used in this study spanned these two centuries. One such pattern is the change in
likelihood, from Napoleonic to modern warfare, that a particular combatant wins a war.
The change in the likelihood of a particular outcome between centuries was also of
interest. Any wars found to have significant effects on estimating the models were
identified for future research.


Limitations 

The  data  sets  used  for  this  research  were  obtained  from  the  Correlates  of  War 
Project  (COWP).  The  COWP  is  based  in  Urbana,  IL,  and  consists  of  scholars,  mostly 
political  scientists,  devoted  to  increasing  the  scientific  exploration  and  knowledge  of  war. 
The  group  was  founded  in  1963  by  political  scientist  J.  David  Singer,  who  was  soon 
joined  by  historian  Melvin  Small.  The  data  sets  compiled  by  the  COWP  consist 
primarily  of  variables  determined  by  the  group  to  be  relevant  to  the  onset  of  war,  such  as 
international  trade,  nonaggression  pacts,  defense  alliances,  geographic  contiguity, 
national  materiel  production,  and  diplomatic  representation. 

The  small  number  of  variables  for  which  the  COWP  collected  data  limited  the 
discovery  of  a  comprehensive  list  of  statistically  significant  war  termination  factors.  This 
limitation  also  restricted  the  size  and  implications  of  the  resulting  logistic  regression 
models. The existing war termination literature discusses more variables than were
available within the COWP data. Consequently, some of the insights gained
from  the  social  science  realm  remain  open  to  further  investigation. 

The  data  sets  available  from  the  Correlates  of  War  Project,  which  also  included 
data  concerning  diplomatic  ties,  trade  agreements,  and  alliances  in  addition  to  war  data, 
were  compiled  by  different  persons.  Therefore,  it  was  difficult  to  pinpoint  similarities 
between data sets. The ability to add and delete variables between data sets such that the
models  are  better  specified  also  requires  additional  investigations. 

Numerous  missing  entries  existed  within  each  of  the  data  sets.  While  valid 
statistical  techniques  can  be  used  to  fill  in  missing  data,  the  resulting  analyses  would  be 
more  useful  in  real-world  applications  if  the  data  were  complete.  The  sample  sizes  for 
each  of  the  three  data  sets  analyzed,  on  the  other  hand,  were  sufficiently  large  such  that 
the  observations  containing  missing  data  could  be  deleted  with  little  effect  on  the  model 
parameter  estimates. 


Research  Focus 

This  research  focused  on  the  analyses  of  data  concerning  three  types  of  wars: 
interstate,  intrastate,  and  extra-state  or  extra-systemic.  The  data  were  further 
distinguished  by  century.  That  is,  the  data  for  each  war  type  were  further  divided  into 
19th and 20th Century data. Interstate wars are those whose participants are
internationally  recognized  nations.  Intrastate  wars  are  defined  as  armed  conflicts 
involving  belligerents  confined  within  a  nation’s  geographic  borders,  including  civil 
wars. Extra-state or extra-systemic wars are those involving state and non-state actors
in which the fighting occurs outside the nation’s borders. The terms extra-state and extra-systemic
both  define  the  same  type  of  war,  so  they  are  used  interchangeably  throughout  this  thesis. 

A  review  of  existing  social  science  literature  on  war  termination  was  conducted. 
The  level  of  attention  previously  devoted  to  the  subject  of  war  termination  was  addressed. 
The  literature  review  also  discussed  the  subjective  methods  used  in  prior  studies  to 
classify the types of war termination. These prior classifications provided a basis from
which to construct the war termination categories used in this study.

Two sources of logistic regression theory were reviewed. The work of Hosmer
and Lemeshow explained virtually all of the techniques and methods used in logistic
regression. The contribution by Montgomery, Peck, and Vining to this study was a
thorough description of the least squares method used to estimate the logistic regression
model parameters.

Subsets of variables from the original COWP data were selected. These
selections were made based primarily on relevant factors discussed in the social science
literature on war termination. Additionally, the sets of selected variables were further
limited by variable availability in the COWP data. That is, several factors deemed
important by social scientists were not available in the COWP data. The variable
restriction, however, did not adversely affect the overall intent of this study, which was to
demonstrate the applicability of logistic regression techniques to war termination
problems. A sufficient number of variables were provided by the COWP such that the
effectiveness and potential of logistic regression applications to war termination could be
shown.

Stepwise selection is a robust procedure that was used to determine an initial set
of statistically significant variables for each fitted model. Stepwise selection was
conducted on the variables for the 19th Century, 20th Century, and aggregated data for
each type of war. The results from the stepwise procedure were used to estimate initial
logistic regression models. The initial models were each assessed for goodness-of-fit and
individual covariate significance. The significance tests confirmed either the overall
adequacy  of  an  initial  model  or  the  need  to  fit  a  reduced  model. 

The  statistical  software  program  used  in  this  study  was  MINITAB.  Several 
software  packages  have  been  programmed  to  fit  and  analyze  logistic  regression  models, 
but  MINITAB  was  chosen  for  two  reasons.  One,  MINITAB  was  readily  available  and 
accessible.  Secondly,  MINITAB  had  been  programmed  to  support  binary  logistic 
regression,  multinomial  logistic  regression,  and  virtually  all  of  the  significance  tests, 
goodness-of-fit  tests,  diagnostic  measures,  and  diagnostic  plots  necessary  for  this 
investigation. 

Each  of  the  final  models  was  assessed  for  overall  adequacy  using  three 
statistically equivalent goodness-of-fit tests. Individual covariate significance was also
determined  through  tests  on  their  coefficients.  The  implications  of  each  model  were  also 
interpreted.  Three  types  of  residual  plots  were  examined  for  influential  observations. 
Once  identified,  the  influence  points  were  analyzed  for  their  net  effects  on  model 
coefficient  estimations.  When  necessary,  new  models  were  fit  with  the  influential  data 
points  deleted. 

A  general  assessment  of  the  findings  of  this  study  was  given.  War  termination 
implications  across  two  centuries  of  warfare  and  across  three  types  of  wars,  given  the 
open-source  data  used,  were  stated.  Opportunities  for  future  statistical  studies  on  war 
termination  were  considered.  In  addition,  proposals  for  additional  applications  of  logistic 
regression  methods  to  war  termination  were  discussed. 
II. Literature Review


General 

Few will deny that wars do not all end in the same manner, yet not enough
attention is paid to the elements contributing to the conclusion of wars. Fred Ikle
addresses the one-way street between how wars begin and how they end, and he insists
that the process of termination has a longer-lasting effect on the ensuing peace than
any other element of war (Ikle, 1991:vii). One need look no further than to German
actions  during  World  War  I  and  to  French  actions  after  World  War  I  to  accept  Ikle’s 
assessment  as  an  axiom  of  war.  Germany  launched  its  unrestricted  submarine  warfare 
campaign in 1917 with the intent to inflict massive panic upon the British population and
end the war on German terms, but the campaign instead had the unintended
consequence of drawing the United States into the war, which hastened Germany’s defeat
(Ikle, 1991:xi). Germany’s perceived military excesses during World War I led to French
insistence  that  the  Versailles  treaty  punish  Germany  economically  through  massive  war 
reparations  and  humiliate  Germany  diplomatically  by  forcing  her  to  accept  the  aggressor 
label.  The  eventual  rise  of  Adolf  Hitler  and  Nazi  Germany  can  be  traced  back,  at  least  in 
part,  to  French  contributions  to  the  Treaty  of  Versailles. 

Classifying  the  manners  in  which  wars  end  is  important  to  a  probabilistic  analysis 
of  war  termination.  Paul  Pillar  conducts  such  a  classification  in  his  analyses.  However, 
he  postulates  that  most  future  wars  will  end  through  negotiated  agreements,  so  his 
classification  of  the  types  of  war  termination  is  influenced  by  this  assertion.  It  must  first 
be determined whether combat ends at the same time as the war (Pillar, 1983:11). For
example, Serbia and Turkey signed a peace treaty in March 1877, which technically
ended the First Balkan War, but some Serbian forces continued to fight the Turks through
the beginning of the Russo-Turkish War in April 1877 (Pillar, 1983:22). Pillar classifies
this type of war termination as absorption. That is, the ending of a small war is marked
by one or more of its belligerents becoming involved in a larger war. If combat does
indeed end simultaneously with the war, then it should be determined whether the
fighting ended because of a mutual agreement by all belligerents or because one side
applied sufficient military force to the opposition such that its enemy could no longer
continue. If the latter is the case, then Pillar denotes this type of war termination as
extermination or expulsion. When all sides mutually decide to end the war, then Pillar
notes either the existence or absence of a written agreement. Pillar defines withdrawal as
a war which terminates without a written agreement (Pillar, 1983:14).

For explicit agreements, Pillar distinguishes between those negotiated by the
belligerents themselves and those negotiated by third parties. Pillar further assumes that
international organizations have almost always played the role of the third party in
written negotiations. As such, he uses the term international organization to denote the
category for wars in which a third party aids in written agreements (Pillar, 1983:15).

When formal settlements are handled by the belligerents themselves, Pillar
discerns whether or not a settlement is imposed by one side upon the other. If this is the
case, then capitulation has occurred. If the settlement is indeed mutually negotiated, then
Pillar differentiates between agreements negotiated before an armistice and those
negotiated after an armistice (Pillar, 1983:15). These distinctions add support to the
construction of a polychotomous, or multi-category, dependent variable on the outcomes
of wars.

With the response variable defined, the focus of investigation must necessarily
shift towards the common factors that contribute to stopping a given war. Additionally,
attention should be given to the manner in which a war ends, not just why it ends. For
example, the proportion of total casualties taken by one belligerent may prove to be more
significant if the war ends through capitulation than if it ends through a negotiated
settlement. Because every war is different, only a few termination variables are present
in all wars.

Ikle points out the obvious economic and social costs of casualties and military
expenditures (Ikle, 1991:1). Even with the ongoing Operation Iraqi Freedom (OIF), the
most commonly cited measures are the numbers of US dead and wounded, Iraqi civilian
deaths, and the billions of dollars per month spent on the conflict. Most other factors
mentioned in the literature are qualitative in nature. As a consequence, limited data are
available for these factors, and their relevance is largely based on hindsight, conjecture,
and inference.

There does exist at least one case where these subjective variables are applied to
social science war termination theories using what could be considered survey data as
supporting evidence. Joseph Engelbrecht, in his analyses of four war termination
theories, uses transcripts from interviews with Japanese officers captured during World
War II to support his conclusions (Engelbrecht, 1992:82-87). His conclusions, however,
seek to explain why wars end rather than to relate the relevant factors to specific types of
war endings. His case-study approach only addresses one type of war termination:
surrender or capitulation. In two of the three cases, the Japanese surrender in 1945 and
the Afrikaner surrender to the British in South Africa in 1902, a formal settlement to the
conflict was reached.

Two  interesting  political  science  theories  on  war  termination  are  considered  by 
Engelbrecht  and  tested  against  three  cases.  One  theory  is  based  on  a  winners  and  losers 
approach.  The  other  focuses  on  cost/benefit  analyses.  The  three  test  cases  he  used  were 
the  Japanese  decision  to  surrender  in  August  1945,  the  Afrikaner  decision  to  surrender  to 
the  British  in  South  Africa  in  1902,  and  the  British  decision  to  continue  fighting  the  Nazis 
following  the  fall  of  France  in  1940.  He  applied  each  theory  to  each  case,  analyzed  the 
particulars  of  each  case,  and  determined  which  theory  best  fit  the  decisions  made  in  each 
case  (Engelbrecht,  1992:61-63). 

The  Winners  and  Losers  model  identifies  two  outcomes  of  war  and  emphasizes 
that  one  side  is  the  clear  victor,  and  the  other  side  is  the  vanquished.  This  model  stresses 
the  defeat  of  enemy  military  forces  as  the  key  to  convincing  the  enemy  to  either  seek  a 
peaceful  settlement  or  surrender.  This  theory  is  commonly  applied  when  one  can  identify 
a  specific  battle  or  campaign  that  marks  a  turning  point  in  the  war  (Engelbrecht,  1992:63- 
64). 

For example, the Battle of Midway in 1942 is identified as the battle that
turned  the  tide  of  World  War  II  against  Imperial  Japan.  Interrogations  of  Imperial 
Japanese  military  officers  at  the  end  of  World  War  II  confirmed  that  the  American 
victory  at  Midway  signaled  the  eventual  defeat  of  Japan  (Engelbrecht,  1992:82-87).  In 
the  Afrikaner  case,  the  fall  of  Pretoria  in  1900  turned  the  tide  of  the  Anglo-Boer  War 
against  the  Boer  forces  (Engelbrecht,  1992:155-157). 
The German blitzkrieg through the Ardennes, the defeat of the British
Expeditionary Force (BEF) in Belgium, and the fall of France were devastating defeats to
the United Kingdom in 1940, yet the British refused to negotiate or surrender. However,
the defeated nation must capitulate soon after such turning points in order for the Winners
and Losers theory to be valid (Engelbrecht, 1992:215). In all the cases described above,
the  defeated  nation  did  not  immediately  surrender,  despite  heavy  battlefield  losses.  The 
Afrikaners  did  not  surrender  to  the  British  until  1902.  The  Japanese  surrender  did  not 
come  until  1945,  yet  the  interrogated  Japanese  officers  deemed  the  surrender  inevitable, 
even  without  the  atomic  bomb  attacks  on  Hiroshima  and  Nagasaki.  On  the  other  hand, 
the  British  never  surrendered  or  talked  of  peace  with  Nazi  Germany.  Why?  Why  did 
surrender  eventually  occur  in  all  the  other  cases,  except  the  British?  The  same  conditions 
of  a  humiliating  military  defeat  existed  in  all  the  cases,  yet  surrender  did  not  always 
occur. 

The  Cost  Benefit  model  focuses  on  comparing  the  costs  of  prosecuting  a  war  with 
the  achievement  of  the  war’s  objectives.  For  this  theory  to  be  applicable,  the  losing 
nation  is  expected  to  first  weigh  the  costs  of  war.  That  is,  it  must  consider  the  raw 
numbers  of  human,  war  weapon,  logistic,  and  economic  losses.  Then,  the  losing  nation 
must  determine  whether  or  not  its  war  aims  can  still  be  reasonably  met.  If  its  war 
objectives  cannot  reasonably  be  met,  then  the  Cost  Benefit  model  implies  that 
capitulation  must  occur  (Engelbrecht,  1992:30-32).  In  all  three  cases  analyzed  by 
Engelbrecht,  no  evidence  suggested  the  use  of  any  rational  cost  benefit  analyses  to  decide 
the  question  of  war  termination,  at  least  while  the  war  was  ongoing.  That  is  not  to  say 
that costs were not discussed, but such discussions did not directly produce a decision to
surrender, or in the British case, to continue fighting (Engelbrecht, 1992:32-33).

James Walker begins his Naval War College study by addressing the question of
why war termination plans should be considered. He notes that the majority of wars
since 1800 have ended with negotiated peace agreements. This fact moves the purpose of
military force away from the wholesale destruction of enemy forces on the battlefield and
toward the application of sufficient force to achieve diplomatic and political goals. He
points to the numerous Arab-Israeli wars to support the idea of this paradigm shift. The
undefeated military record of Israel, most notably in its War of Independence in 1948, the
Six Day War in 1967, and the Yom Kippur War in 1973, has achieved neither a lasting
peace nor a resolution of the political, social, and religious issues between Israel and her
Arab neighbors. Dynamic political, diplomatic, social, and cultural issues lend even
more importance to war termination planning (Walker, 1996:1-2).

Walker notes that war termination is mentioned in the joint military doctrine of
the United States, but the attention it is given is brief and the language vague. He
describes a state of tunnel vision resulting from America’s status as the lone superpower.
That is, military commanders falsely assume that the mere overwhelming application of
America’s superior weapons and firepower will automatically produce the desired peace
(Walker, 1996:2-4). This assessment essentially echoes a similar statement made by Ikle,
where Ikle asserted that military power should be applied only to the extent that it will
contribute positively to the desired peace, and such applications should be explicitly
defined in military strategies for war. Ikle maintains that the indiscriminate destruction
of enemy forces and civilians is most detrimental to the desired atmosphere of peace
(Ikle, 1991:ix-xi).

The  products  of  termination  agreements  must  be  considered.  Will  written 
documents  be  drafted  and  signed  by  all  parties?  If  so,  will  it  be  a  formal  treaty?  If  not  a 
treaty,  will  it  be  an  armistice  or  limited  cease-fire?  Walker  highlights  these  details  for 
two  reasons.  One,  the  Gulf  War  negotiations  yielded  no  written  agreements,  only  audio 
recordings.  Two,  Walker  emphasizes  the  international  legitimacy  behind  written 
agreements.  Although  only  treaties  are  legally  binding,  written  agreements,  in  general, 
still  provide  a  certain  degree  of  political  and  diplomatic  leverage  in  the  event  that  one 
side  eventually  breaks  the  deal  (Walker,  1996:12-13).  Unlike  Pillar,  Walker  treats 
armistices  and  cease-fires  as  actual  termination  agreements  rather  than  conditions  upon 
which  formal  war  settlements  hinge. 

The method Walker offers for planning war termination is to emphasize its importance
in both doctrine and training. Beyond that,
he  only  stresses  drafting  war  termination  plans  early  in  the  strategic  planning  cycle.  As 
with  other  operational  plans,  war  termination  plans  should  be  updated  according  to  the 
progression  of  affairs  in  the  war.  Alternatives  within  the  termination  plans  should  be 
analyzed,  and  contingencies  should  also  be  considered  (Walker,  1996:13-14).  Rather 
than provide guidance on war termination methods, Walker focuses on the lack of
attention  given  to  and  the  necessity  for  early  planning  of  war  termination  (Walker, 
1996:16). 
Correlates of War Project (COWP)


The  COWP  is  an  organization  that  provides  open-source  data  on  wars  and  factors 
which  account  for  wars.  The  COWP  has  compiled  thirteen  data  sets.  These  sets  contain 
variables  concerning  state  system  membership,  interstate  wars,  intrastate  wars,  extra- 
systemic  wars,  militarized  interstate  disputes,  national  materiel  capabilities,  formal 
alliances,  territorial  changes,  geographic  contiguity,  colonial  dependency, 
intergovernmental  organizations  (IGOs),  diplomatic  representation,  and  bilateral  trade. 

In  the  context  of  a  war  termination  study,  the  interstate,  extra-state,  and  intrastate  war  sets 
are  of  primary  interest.  The  interstate  set  contains  data  concerning  the  nations 
participating  in  79  interstate  wars  from  1823  to  1991.  The  intrastate  set  contains  data 
concerning the state belligerents in 213 intrastate wars from 1816 to 1997. The extra-state
set contains data concerning the state actors in 108 extra-systemic wars from 1817
to  1983.  Appendix  A  shows  the  variables  included  in  each  of  the  three  war  data  sets  and 
their  definitions  as  assigned  by  the  COWP. 


Statistical  Application 

Suppose  the  response  variable  in  a  statistical  study  on  war  termination  is  the 
winner  of  a  war.  Either  a  particular  combatant  wins,  or  his  opponent  does.  He  succeeds 
in  defeating  his  opponent  or  his  enemy  defeats  him.  Since  this  response  has  only  two 
possible  outcomes,  and  its  category  definitions  are  arbitrary,  the  winner  of  a  war  can  be 
defined  as  a  Bernoulli  random  variable  (Montgomery,  Peck,  and  Vining,  2001:443-444). 
That  is,  each  category  for  the  winner  has  a  probability  attached  to  it.  As  a  contemporary 
example, let $Y_j$ denote the winner of the $j$th extra-systemic war, which involved the
United States and the terrorist group Hamas. Let $j$ denote the $j$th extra-state war from a
sample of $n$ extra-state wars, where $j = 1, 2, \ldots, n$. If $Y_j = 0$, then Hamas is the winner. If
$Y_j = 1$, then the United States is the winner. Since $Y_j$ is a Bernoulli random variable, the
probability that $Y_j = 0$ and the probability that $Y_j = 1$ are the quantities under
investigation (Montgomery, Peck, and Vining, 2001:444). The goal now is to determine
a mathematical relationship between who wins an extra-systemic war and appropriate
contributory or predictor variables.
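
As a minimal sketch of what such a mathematical relationship could look like, the Python fragment below evaluates a binary logistic response function for a single predictor. The use of war duration as the predictor, the function names, and the coefficient values are assumptions made purely for illustration; they are not estimates from the COWP data.

```python
import math

# Hypothetical coefficients chosen only for illustration; they are NOT
# estimates obtained from the Correlates of War data.
ASSUMED_INTERCEPT = 1.5
ASSUMED_SLOPE = -0.02  # assumed change in log-odds per additional month of war


def win_probability(duration_months, intercept=ASSUMED_INTERCEPT, slope=ASSUMED_SLOPE):
    """Return P(Y_j = 1 | duration) under a one-covariate logistic model."""
    linear_predictor = intercept + slope * duration_months
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def odds(probability):
    """Convert a probability into odds, P / (1 - P)."""
    return probability / (1.0 - probability)


if __name__ == "__main__":
    for months in (6, 60, 120):
        p_win = win_probability(months)
        # P(Y_j = 0) is simply the complement of P(Y_j = 1).
        print(f"duration={months:>3} months  P(Y=1)={p_win:.3f}  P(Y=0)={1 - p_win:.3f}")
```

Under a model of this form, each one-unit increase in the predictor multiplies the odds of the outcome coded 1 by the exponential of the slope, which is why logistic regression results are often reported as odds ratios.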

Alternatively,  suppose  the  response  variable  in  a  statistical  study  on  war 
termination  is  the  manner  in  which  a  war  terminates.  More  than  two  types  of  war 
termination have been defined to exist, so the response is polychotomous or multi-category.
The probabilities for the different types of war termination are still of interest,
but  each  war  termination  probability  is  compared  to  a  reference  or  baseline  war 
termination  probability  (Hosmer  and  Lemeshow,  2000:260-261).  That  is,  the  type  of  war 
termination  that  is  most  prevalent  is  selected  to  be  the  reference  category,  and  the 
remaining  categories  are  compared  to  it.  Mathematical  relationships  between  each 
comparison  and  several  predictor  variables  can  now  be  established.  In  this  case,  the 
objective  can  be  to  determine  how  likely  one  type  of  war  termination  is  to  occur  over  the 
baseline  war  termination  method  (Hosmer  and  Lemeshow,  2000:265). 
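
The comparison to a baseline category can be sketched as follows. This is a minimal illustration, assuming a single predictor (a nation's share of total casualties), three hypothetical termination categories, and made-up coefficients; none of these values come from the COWP data or from the models fitted in this study.

```python
import math

# Assumed reference category; its linear predictor is fixed at zero.
BASELINE = "negotiated settlement"

# Hypothetical (intercept, slope) pairs for each non-baseline category.
# These numbers are illustrative only, not fitted estimates.
ASSUMED_COEFFICIENTS = {
    "capitulation": (-2.0, 3.0),
    "withdrawal": (-1.0, 0.5),
}


def termination_probabilities(casualty_share, coefficients=ASSUMED_COEFFICIENTS):
    """Return category probabilities under a baseline-category logit model."""
    # Each logit is log(P(category) / P(baseline)) for the given predictor value.
    logits = {cat: b0 + b1 * casualty_share for cat, (b0, b1) in coefficients.items()}
    denominator = 1.0 + sum(math.exp(value) for value in logits.values())
    probabilities = {BASELINE: 1.0 / denominator}
    for cat, value in logits.items():
        probabilities[cat] = math.exp(value) / denominator
    return probabilities


if __name__ == "__main__":
    for share in (0.2, 0.5, 0.8):
        probs = termination_probabilities(share)
        summary = ", ".join(f"{cat}: {p:.3f}" for cat, p in probs.items())
        print(f"casualty share={share:.1f} -> {summary}")
```

The ratio of any category's probability to the baseline probability equals the exponential of that category's linear predictor, which is the quantity that expresses how likely one type of war termination is relative to the baseline.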

Once  the  response  is  identified  and  its  structure  defined,  a  set  of  candidate 
predictor  variables  is  compiled.  Advanced  statistical  techniques  can  be  applied  to  these 
candidate  variables  to  determine  the  strengths  of  their  relationships  to  the  response.  The 
results  from  such  techniques  can  justify  the  retention  or  elimination  of  some  of  the 
candidate  variables. 
Logistic Regression


Because  this  thesis  focuses  on  analyses  performed  on  existing  data,  a  regression 
technique  is  an  effective  way  of  describing  the  relationship  between  how  a  war  ends,  or 
who  wins  a  war,  and  the  factors  contributing  to  such  outcomes.  The  outcome  of  a  war  is 
not  a  continuous  variable,  so  classical  linear  regression  is  not  a  valid  approach.  Instead, 
this  thesis  seeks  to  assess  the  likelihoods  of  different  outcomes  of  war,  and  such 
likelihoods  can  be  derived  from  conditional  probabilities.  Logistic  regression  is  the 
preferred  method  for  this  approach,  primarily  because  the  outcome  variables  are  discrete 
categorical  variables,  either  binomial  or  multinomial  (Hosmer  and  Lemeshow,  2000:1). 
Some  texts  use  the  synonymous  terms  binary  or  dichotomous  when  referring  to  a  logistic 
regression  model  with  a  two-category  response.  They  also  may  use  the  terms 
polychotomous  or  polytomous  when  referring  to  a  logistic  regression  model  with  a 
response  containing  three  or  more  categories  (Hosmer  and  Lemeshow,  2000:260). 

The  nature  of  the  response  variable  determines  the  type  of  parametric  model  to  be 
used.  It  also  determines  the  assumptions  that  can  be  made.  In  linear  regression,  the 
response  is  continuous,  and  the  distribution  of  the  response  is  assumed  to  be  normal.  The 
outcome  of  a  war,  however,  is  not  a  continuous  random  variable  as  defined  in  this  study. 
Similarly,  the  winner  of  a  war  is  not  a  continuous  random  variable.  Thus,  the  normality 
assumption  no  longer  holds  for  the  responses  in  this  study.  These  responses  must  be 
described  by  a  different  probability  distribution  (Hosmer  and  Lemeshow,  2000:1). 

As with linear regression, model parsimony is also desired with logistic
regression. That is, the goal is to fit the model with the smallest number of contributory
variables, or covariates, that best describes the relationship between an outcome, or response,
and a set of covariates, or predictors (Hosmer and Lemeshow, 2000:1). The model can contain
continuous variables, categorical variables, or both.


Binary  Logistic  Regression. 

The theory behind binary logistic regression is commonly explained using a
univariate model, where only one covariate is present. The techniques are readily
adapted to multivariate cases. The focal quantity for binary logistic regression is the
conditional mean of the response given a certain value of the covariate, which for a binary
response is the conditional probability $P(Y = 1 \mid x)$. Several cumulative distributions
have been proposed and used to fit models for this conditional probability, but the logistic
distribution is used for logistic regression because of its ease of interpretation. The binary
logistic regression model is of the form

$$\pi(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}} \tag{2.1}$$

where $\pi(x) = P(Y = 1 \mid x)$ represents the conditional probability of the response $Y$ given the
covariate $x$ (Hosmer and Lemeshow, 2000:6). For the multivariate case, let

The method of maximum likelihood is used to estimate the parameters of the
model, but the model must be transformed and made linear in its parameters $\beta_0$ and $\beta_1$.
The transformation used is called the logit. The logit is defined in terms of $\pi(x)$.

$$g(x) = \ln\left[\frac{\pi(x)}{1 - \pi(x)}\right] = \beta_0 + \beta_1 x \tag{2.3}$$

For multiple covariates, the logit becomes

$$g(\mathbf{x}) = \ln\left[\frac{\pi(\mathbf{x})}{1 - \pi(\mathbf{x})}\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \tag{2.4}$$

It should be noted that the quantity $\pi(x)/[1 - \pi(x)]$ is called the odds, that is, the ratio
of the probability of success to the probability of failure. Therefore, the logit is also
called the log-odds (Montgomery, Peck, and Vining, 2001:445-446).

An observation of a dichotomous response given $x$ is expressed as $y = \pi(x) + \varepsilon$,
but the assumption of normality in the distribution of the error term $\varepsilon$ does not apply in
this case, as it does in linear regression (Hosmer and Lemeshow, 2000:6). Instead, the
errors follow the binomial distribution, with a mean or expected value of zero and a
variance equal to the product of the probability that $y = 1$ and the probability that $y = 0$.
That is,

$$\varepsilon = 1 - \pi(x) \text{ with probability } \pi(x), \text{ for } y = 1, \tag{2.5}$$

$$\varepsilon = -\pi(x) \text{ with probability } 1 - \pi(x), \text{ for } y = 0, \tag{2.6}$$

$$E(\varepsilon) = 0, \text{ and} \tag{2.7}$$

$$Var(\varepsilon) = \pi(x)[1 - \pi(x)]. \tag{2.8}$$
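The relationship between the logistic model in equation (2.1) and the logit in equation (2.3) can be checked numerically. The following is a minimal sketch in Python rather than MINITAB; the function names and the coefficient values are illustrative assumptions and are not estimates from the war data.

```python
import math

def logistic(x, beta0, beta1):
    """Equation (2.1): pi(x) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    eta = beta0 + beta1 * x
    return math.exp(eta) / (1.0 + math.exp(eta))

def logit(p):
    """Equation (2.3): the log-odds ln[p / (1 - p)], defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -1.0, 0.5
for x in (0.0, 2.0, 4.0):
    p = logistic(x, beta0, beta1)
    # Applying the logit recovers the linear predictor beta0 + beta1 * x.
    print(f"x={x}: pi(x)={p:.4f}, logit={logit(p):.4f}")
```

The loop illustrates why the logit is described as linear in the parameters: the probabilities are bounded between 0 and 1, while their log-odds fall on a straight line in $x$.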
Constructing the likelihood function is the first step towards estimating the
logistic regression model parameters. Let $(x_j, y_j)$ denote one observation out of a set of
$n$ independent observations, where $y_j$ is the binary response, $x_j$ is the value of the
covariate for the $j$th observation, and $j = 1, 2, \ldots, n$ (Hosmer and Lemeshow, 2000:7).
The contribution of $(x_j, y_j)$ to the likelihood function is expressed as an independent
Bernoulli trial, or

$$\pi(x_j)^{y_j}\,[1 - \pi(x_j)]^{1 - y_j} \tag{2.9}$$

Since there are $n$ independent Bernoulli trials, and each trial contributes to the likelihood
function, the likelihood function becomes the product of independent trials, or

$$l(\beta_0, \beta_1) = \prod_{j=1}^{n} \pi(x_j)^{y_j}\,[1 - \pi(x_j)]^{1 - y_j} \tag{2.10}$$

In order to find the values of $\beta_0$ and $\beta_1$ that maximize equation (2.10), the natural
logarithm of equation (2.10), the log-likelihood function, is computed because it is easier
to manipulate (Hosmer and Lemeshow, 2000:8). Differentiating the log-likelihood
function $L(\beta_0, \beta_1)$ with respect to $\beta_0$ and $\beta_1$, and setting each resulting partial
derivative to zero, yields the likelihood equations. With $\pi_j = \pi(x_j)$,

$$L(\beta_0, \beta_1) = \sum_{j=1}^{n} \left[ y_j \ln(\pi_j) + (1 - y_j) \ln(1 - \pi_j) \right] \tag{2.11}$$

$$\sum_{j=1}^{n} \left[ y_j - \pi(x_j) \right] = 0 \tag{2.12}$$

$$\sum_{j=1}^{n} x_j \left[ y_j - \pi(x_j) \right] = 0 \tag{2.13}$$

Using vector notation, the form of the log-likelihood function for multivariate cases is

$$L(\boldsymbol{\beta}) = \sum_{j=1}^{n} y_j \mathbf{x}_j' \boldsymbol{\beta} - \sum_{j=1}^{n} \ln\left[ 1 + \exp(\mathbf{x}_j' \boldsymbol{\beta}) \right] \tag{2.14}$$

(Montgomery, Peck, and Vining, 2001:448). Because the likelihood equations are
nonlinear in their parameters, a closed-form solution is not possible. An iterative search
method called iteratively reweighted least-squares (IRLS) is implemented to obtain
solutions (Hosmer and Lemeshow, 2000:9).

Most modern statistical software packages that fit logistic regression models have
this iterative search method programmed into them. IRLS employs the Newton-Raphson
algorithm as a robust method to approximate solutions to the likelihood equations.
Hosmer and Lemeshow do not describe the details of IRLS, but the interested reader
should refer to Montgomery, Peck, and Vining for a complete explanation of IRLS
(Montgomery, Peck, and Vining, 2001:610-613).
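To make the iterative search concrete, the sketch below solves the two likelihood equations of the univariate model with plain Newton-Raphson steps, which for logistic regression produce the same iterates as IRLS. It is an illustration under stated assumptions, not the routine MINITAB uses: the function name `fit_logistic`, the starting values, the tolerance, and the small data set are all hypothetical.

```python
import math

def fit_logistic(xs, ys, tol=1e-10, max_iter=50):
    """Newton-Raphson solution of the univariate likelihood equations."""
    b0, b1 = 0.0, 0.0  # assumed starting values
    for _ in range(max_iter):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Score vector: the left-hand sides of the two likelihood equations.
        u0 = sum(y - p for y, p in zip(ys, ps))
        u1 = sum(x * (y - p) for x, y, p in zip(xs, ys, ps))
        # Information matrix X'VX, with weights p(1 - p) on the diagonal of V.
        ws = [p * (1.0 - p) for p in ps]
        i00 = sum(ws)
        i01 = sum(x * w for x, w in zip(xs, ws))
        i11 = sum(x * x * w for x, w in zip(xs, ws))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system (X'VX) * delta = score.
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) < tol and abs(d1) < tol:
            break
    return b0, b1

# Hypothetical data: one covariate and a 0/1 response with overlapping groups.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print(f"beta0_hat={b0:.4f}, beta1_hat={b1:.4f}")
```

The sketch assumes the responses are not perfectly separated by the covariate; under complete separation the maximum likelihood estimates do not exist and the iterations would not settle.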

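Let $\hat{\boldsymbol{\beta}}$ be the final IRLS estimate. Then, the logit becomes
$\hat{g}(\mathbf{x}_j) = \mathbf{x}_j' \hat{\boldsymbol{\beta}}$, and the fitted logistic regression model becomes

$$\hat{\pi}_j = \frac{\exp(\mathbf{x}_j' \hat{\boldsymbol{\beta}})}{1 + \exp(\mathbf{x}_j' \hat{\boldsymbol{\beta}})} \tag{2.15}$$

(Montgomery, Peck, and Vining, 2001:449).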


Parameter  Interpretation. 

For the binary model, the fitted value of its logit at a particular value $x_j$ of its single
covariate is $\hat{g}(x_j) = \hat{\beta}_0 + \hat{\beta}_1 x_j$. Let the value of the logit at $x_j + 1$
be $\hat{g}(x_j + 1) = \hat{\beta}_0 + \hat{\beta}_1 (x_j + 1) = \hat{\beta}_0 + \hat{\beta}_1 + \hat{\beta}_1 x_j$.
Therefore, the difference between the two fitted logit values is

$$\hat{g}(x_j + 1) - \hat{g}(x_j) = \hat{\beta}_1$$

$$\hat{g}(x_j + 1) - \hat{g}(x_j) = \ln\left[\frac{\hat{\pi}(x_j + 1)}{1 - \hat{\pi}(x_j + 1)}\right] - \ln\left[\frac{\hat{\pi}(x_j)}{1 - \hat{\pi}(x_j)}\right] = \ln\left[\frac{odds_{x_j + 1}}{odds_{x_j}}\right]$$

If the antilogarithm of the above quantity is taken, then the result is called the odds ratio,

$$\hat{O}_R = \frac{odds_{x_j + 1}}{odds_{x_j}} = e^{\hat{\beta}_1} \tag{2.16}$$

which is the estimated multiplicative change in the odds of success per one-unit change in the
covariate $x$. For multivariate models, $e^{\hat{\beta}_j}$ is the estimated multiplicative change in
the odds per one-unit change in the $j$th covariate, given that the values for the remaining
$k - 1$ covariates are constant (Montgomery, Peck, and Vining, 2001:452).

Odds ratios, rather than the parameter estimates, are used to describe the results of
a fitted binary logistic regression model. For example, suppose that a binary logistic
regression model on the winner of an intrastate war contains the length of the conflict as
the predictor variable, and suppose $Y$ denotes the binary random variable for the
winner. Let $Y = 0$ denote that the state actor wins the intrastate war, and let $Y = 1$ denote
that the non-state actor, rebel faction, or insurgency wins the war. In addition, suppose
that 2.5 is found to be the fitted odds, $\hat{\pi}/(1 - \hat{\pi})$, for this model when the duration of
the war is 1440 days. It can then be said that the non-state belligerent is two and a half
times as likely to win an intrastate war as the state, given that the war lasts 1440 days.
An odds ratio of 2.5 for duration would instead mean that each additional day of conflict
multiplies those odds by 2.5.
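The distinction between the fitted odds at a given duration and the odds ratio per one-unit change can be sketched numerically. The coefficients below are hypothetical values chosen for illustration, not results from the intrastate war models, and the helper names are assumptions.

```python
import math

def fitted_probability(x, b0, b1):
    """Fitted probability that Y = 1 at covariate value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(p):
    """Odds of success: probability of success over probability of failure."""
    return p / (1.0 - p)

# Hypothetical coefficients for P(non-state actor wins | duration in days).
b0, b1 = -2.0, 0.002
days = 1440

odds_at_days = odds(fitted_probability(days, b0, b1))
odds_next_day = odds(fitted_probability(days + 1, b0, b1))

# The ratio of the two odds does not depend on the starting duration;
# it always equals exp(b1).
odds_ratio = odds_next_day / odds_at_days
print(f"odds at {days} days: {odds_at_days:.3f}")
print(f"odds ratio per extra day: {odds_ratio:.5f}")
```

The fitted odds change with the duration of the war, while the odds ratio for a one-day increase is the same at every duration.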
Measuring the difference between observed and fitted values, or residuals, to
assess a model's goodness-of-fit can be performed by manipulating likelihood ratios.
That is, the IRLS estimates for the parameters in equation (2.3) are substituted into the
log-likelihood function (2.11), which maximizes the value of the log-likelihood function.
By noting that a saturated model is one whose sample size is equal to the number of
parameters it contains, or $n = k + 1$, the difference between the log-likelihood of this
saturated model and the log-likelihood of the fitted model is examined to determine the
fitted model's adequacy.

The deviance $D$ of the fitted model approximately possesses a chi-square
distribution with $n - (k + 1)$ degrees of freedom. The test statistic is given by

$$D = 2 \ln\left[\frac{l(\text{saturated})}{l(\hat{\boldsymbol{\beta}})}\right] = 2\left[ L(\text{saturated}) - L(\hat{\boldsymbol{\beta}}) \right] \tag{2.17}$$

Multiplying the natural logarithm of the likelihood ratio by 2 allows the deviance to
approximate a chi-square distribution (Hosmer and Lemeshow, 2000:13). If
$D \le \chi^2_{\alpha,\,n-(k+1)}$, then the fitted model is appropriate; $D > \chi^2_{\alpha,\,n-(k+1)}$
implies that the fitted model is incorrectly specified (Montgomery, Peck, and Vining, 2001:453).
The quantity $\alpha$ is the specified level of significance; 0.05 is the $\alpha$ level used for
this research.

The second commonly conducted test is the Pearson chi-square statistic. Let $J$ be
the number of distinct values of the covariate observed in the data set, and let $m_j$ be the
frequency of the $j$th distinct covariate value, where $j = 1, 2, \ldots, J$. For the purpose of
computing Pearson residuals, let $y_j$ be the frequency of the $j$th distinct covariate value
for which $y = 1$. It follows that the sum of the $m_j$ fitted values is

$$m_j \hat{\pi}_j = m_j \frac{\exp(\hat{g}(x_j))}{1 + \exp(\hat{g}(x_j))} \tag{2.18}$$

Thus, the Pearson residual for the $j$th distinct covariate value is given by

$$r_j = \frac{y_j - m_j \hat{\pi}_j}{\sqrt{m_j \hat{\pi}_j (1 - \hat{\pi}_j)}} \tag{2.19}$$

The Pearson chi-square statistic, $X^2$, is the sum of the squares of the Pearson residuals.

$$X^2 = \sum_{j=1}^{J} r_j^2 \tag{2.20}$$

As implied by its name, the Pearson chi-square statistic follows a $\chi^2$ distribution with
$J - (k + 1)$ degrees of freedom. The fitted model is said to be correctly specified if
$X^2 \le \chi^2_{\alpha,\,J-(k+1)}$ (Hosmer and Lemeshow, 2000:145-146).
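Both goodness-of-fit statistics can be computed directly from fitted probabilities. The sketch below is an illustration in Python, not MINITAB output. It assumes individual 0/1 responses for the deviance, in which case the saturated model reproduces every response exactly and its log-likelihood is zero, and it assumes data grouped by distinct covariate value for the Pearson statistic. The function names and the numbers are hypothetical.

```python
import math

def deviance(ys, ps):
    """Deviance for individual 0/1 responses, where L(saturated) = 0."""
    loglik_fitted = sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p) for y, p in zip(ys, ps)
    )
    return 2.0 * (0.0 - loglik_fitted)

def pearson_chi_square(ms, ys, ps):
    """Sum of squared Pearson residuals over the J distinct covariate values.

    ms: number of observations at each distinct covariate value
    ys: number of those observations with y = 1
    ps: fitted probability at each distinct covariate value
    """
    total = 0.0
    for m, y, p in zip(ms, ys, ps):
        r = (y - m * p) / math.sqrt(m * p * (1.0 - p))
        total += r * r
    return total

# Hypothetical fitted probabilities for eight individual observations.
d = deviance([0, 0, 1, 0, 1, 0, 1, 1], [0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85])
# Hypothetical grouped data: four distinct covariate values, five wars each.
x2 = pearson_chi_square([5, 5, 5, 5], [1, 2, 3, 4], [0.25, 0.40, 0.55, 0.75])
print(f"D={d:.4f}, X2={x2:.4f}")
```

Each statistic would then be compared with the tabulated chi-square critical value at the degrees of freedom stated above.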

To conduct the Hosmer-Lemeshow test, the observations are grouped using the
following method. Ten groups are created such that each group contains approximately
$n_l = n/10$ fitted values, where $l = 1, 2, \ldots, 10$. The groups are tabulated in order of
increasing fitted value. That is, there are $n_1$ subjects with the smallest fitted values in
group 1, while there are $n_{10}$ subjects with the largest fitted values in group 10. The
groups serve as the columns of a 2 x 10 table, where the rows are denoted by the two
possible values of the dichotomous response. For the $y = 1$ row, the expected
frequencies for each group are computed as follows:

and the Hosmer-Lemeshow statistic, $\hat{C}$, is given by


(2.23) 


The  use  of  10  groups  is  not  universal.  If  the  number  of  distinct  covariate  values  is  small 
or  very  large,  then  adjusting  the  number  of  groups  may  be  necessary.  According  to 
Hosmer  and  Lemeshow,  the  use  of  10  groups  provides  an  adequate  approximation  to  the 
chi-square  distribution  in  most  applications  (Hosmer  and  Lemeshow,  2000:148-149).  In 
this  case,  the  Hosmer-Lemeshow  statistic  is  distributed  chi-square  with  10-2  =  8 
degrees  of  freedom. 
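The grouping procedure can be sketched as follows. This is an illustration that uses the commonly published form of the Hosmer-Lemeshow statistic, in which each group contributes the squared difference between its observed and expected number of $y = 1$ responses divided by $n_l \bar{\pi}_l (1 - \bar{\pi}_l)$, with $\bar{\pi}_l$ the average fitted probability in the group. That form, the function name, and the equal-count splitting rule are assumptions made for the sketch and are not taken from the equations of this thesis.

```python
def hosmer_lemeshow(ys, ps, groups=10):
    """Hosmer-Lemeshow statistic from 0/1 responses and fitted probabilities."""
    # Order the observations by increasing fitted probability.
    pairs = sorted(zip(ps, ys), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        # Split the ordered observations into groups of nearly equal size.
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        observed = sum(y for _, y in chunk)      # observed count of y = 1
        mean_p = sum(p for p, _ in chunk) / n_g  # average fitted probability
        expected = n_g * mean_p                  # expected count of y = 1
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return stat

# Hypothetical example with two groups; compare with chi-square on groups - 2 df.
c_hat = hosmer_lemeshow([0, 0, 1, 1], [0.25, 0.25, 0.75, 0.75], groups=2)
print(f"C_hat={c_hat:.4f}")
```

With small data sets such as the war data, fewer than ten groups may be needed so that no group is empty, which is consistent with the remark above that the use of 10 groups is not universal.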
Diagnostic Measures.


As with linear regression, leverage values for logistic regression are also derived
from a hat matrix, $\mathbf{H}$. Let $\mathbf{V}$ be a $J \times J$ diagonal matrix whose $j$th diagonal element is
given by

$$v_j = m_j \hat{\pi}(x_j)\left[ 1 - \hat{\pi}(x_j) \right]$$

Let the design matrix, $\mathbf{X}$, be the $J \times (k + 1)$ matrix containing all distinct covariate
values. The hat matrix is defined by

$$\mathbf{H} = \mathbf{V}^{1/2} \mathbf{X} (\mathbf{X}' \mathbf{V} \mathbf{X})^{-1} \mathbf{X}' \mathbf{V}^{1/2} \tag{2.24}$$

It follows that the hat matrix in equation (2.24) is also of dimension $J \times J$ (Hosmer and
Lemeshow, 2000:168). The leverage values are the diagonal elements, $h_j$, of the hat
matrix. Instead of plotting the leverage values versus the fitted values, it is more useful
to plot the fitted values against three different measures.

The standardized Pearson residual is central to each of the three measures.
Recalling the Pearson residual from equation (2.19), the standardized Pearson residual for
the $j$th distinct covariate value is

$$r_{sj} = \frac{r_j}{\sqrt{1 - h_j}}, \quad \text{for } j = 1, 2, \ldots, J. \tag{2.25}$$

A useful measure resulting from equation (2.25) is the standardized difference between
$\hat{\boldsymbol{\beta}}$ and $\hat{\boldsymbol{\beta}}_{(-j)}$, where $\hat{\boldsymbol{\beta}}_{(-j)}$ is the vector of maximum likelihood
estimates of the model coefficients with the $m_j$ observations for the $j$th distinct covariate
value removed. This measure, denoted $\Delta\hat{\beta}_j$, is expressed as

$$\Delta\hat{\beta}_j = \frac{r_{sj}^2\, h_j}{1 - h_j} \tag{2.26}$$

(Hosmer and Lemeshow, 2000:173). Letting $d_j$ be the deviance residual for the $j$th
distinct covariate value, the difference in deviance when the $m_j$ observations for that
value are removed, $\Delta D_j$, is given by

$$\Delta D_j = d_j^2 + \frac{r_j^2\, h_j}{1 - h_j} \tag{2.27}$$

The change in the value of the Pearson chi-square statistic is shown to be equal to the
square of the standardized Pearson residual of equation (2.25).

$$\Delta X_j^2 = \frac{r_j^2}{1 - h_j} = r_{sj}^2 \tag{2.28}$$

Distinct covariate values that are inadequately fitted can be identified by large values of
$\Delta D_j$, $\Delta X_j^2$, or both. Large values of $\Delta\hat{\beta}_j$ indicate influence points, that is,
distinct covariate values that exert a significant amount of influence on the estimated values
of the model coefficients (Hosmer and Lemeshow, 2000:174).
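For a model with a single covariate, the leverage values and two of the three diagnostic measures can be computed without matrix software, because $\mathbf{X}'\mathbf{V}\mathbf{X}$ is only 2 x 2. The sketch below is an illustration under that assumption; the function name, the coefficients, and the grouped data are hypothetical, and the change in deviance is omitted for brevity.

```python
import math

def pattern_diagnostics(xs, ms, ys, b0, b1):
    """Leverage, standardized residual, and influence for each distinct x."""
    ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
    vs = [m * p * (1.0 - p) for m, p in zip(ms, ps)]  # diagonal of V
    # Entries of the 2x2 matrix X'VX for the design rows (1, x_j).
    a = sum(vs)
    b = sum(x * v for x, v in zip(xs, vs))
    c = sum(x * x * v for x, v in zip(xs, vs))
    det = a * c - b * b
    rows = []
    for x, m, y, v, p in zip(xs, ms, ys, vs, ps):
        # h_j = v_j * (1, x_j) (X'VX)^{-1} (1, x_j)'
        h = v * (c - 2.0 * b * x + a * x * x) / det
        r = (y - m * p) / math.sqrt(v)  # Pearson residual
        rs = r / math.sqrt(1.0 - h)     # standardized Pearson residual
        rows.append({
            "leverage": h,
            "std_residual": rs,
            "delta_chi2": rs ** 2,                  # change in Pearson chi-square
            "delta_beta": rs ** 2 * h / (1.0 - h),  # influence on the coefficients
        })
    return rows

# Hypothetical grouped data: four distinct covariate values, five wars each.
rows = pattern_diagnostics([1, 2, 3, 4], [5, 5, 5, 5], [1, 2, 3, 4], -1.5, 0.6)
for row in rows:
    print({name: round(value, 4) for name, value in row.items()})
```

A useful check on any such calculation is that the leverage values sum to the number of model parameters, here $k + 1 = 2$.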


Testing Significance of Individual Coefficients.

The likelihood ratio test, $G$, is a test of the hypothesis that all of the model
coefficients are zero. It is analogous to the global $F$ test in linear regression.
The Wald test, $W$, is analogous to the partial $F$ test in linear regression. It
assesses the individual significance of the $j$th covariate. The null and alternative
hypotheses for the $\beta_j$ coefficient are given by

$$H_0: \beta_j = 0, \qquad H_1: \beta_j \ne 0 \tag{2.29}$$

For a multivariate model, $G$ can be computed by subtracting the deviance of the
model containing the $j$th variable from the deviance of the model that does not contain
the $j$th covariate. Because the likelihood for the saturated model is included in both
deviance calculations, $G$ is typically expressed as two times the natural log of the
likelihood ratio between the model containing the $j$th covariate and the model that does
not contain the $j$th covariate.

In the univariate case, the expected value, or probability of success, of the model
that does not contain the single covariate becomes a simple proportion, or the ratio of the
frequency of observations where $y = 1$ to the total number of observations $n$. Similarly,
the probability of failure becomes a ratio of the frequency of observations where $y = 0$ to
the total number of observations. Thus, the likelihood function for the model that does
not contain the covariate is $(n_1/n)^{n_1} (n_0/n)^{n_0}$, where $n_1 = \sum_{j=1}^{n} y_j$ and
$n_0 = \sum_{j=1}^{n} (1 - y_j)$. The likelihood ratio test statistic $G$ then becomes

$$G = 2 \ln\left[\frac{\prod_{j=1}^{n} \hat{\pi}_j^{\,y_j} (1 - \hat{\pi}_j)^{1 - y_j}}{(n_1/n)^{n_1} (n_0/n)^{n_0}}\right] \tag{2.30}$$

Further simplifying equation (2.30) yields an expression in which the outputs from
MINITAB can easily be substituted.

$$G = 2\left\{ \sum_{j=1}^{n} \left[ y_j \ln \hat{\pi}_j + (1 - y_j) \ln(1 - \hat{\pi}_j) \right] - \left[ n_1 \ln n_1 + n_0 \ln n_0 - n \ln n \right] \right\} \tag{2.31}$$

Since this is a test for the significance of a covariate, rather than a test for model
adequacy, the test statistic $G$ is distributed chi-square with one degree of freedom. The
rejection region criteria are

$G \le \chi^2_{\alpha,1}$: fail to reject $H_0$, or

$G > \chi^2_{\alpha,1}$: reject $H_0$; the covariate is significant.

For multivariate models, rejection of the null hypothesis implies that at least one of the
covariates is significant. Additional hypothesis tests are needed to determine which
one(s). One might also use the p-value approach to evaluate the significance of a
covariate. That is, if $P(\chi^2_1 > G) < \alpha$, then sufficient evidence exists to imply the
significance of the covariate under test (Hosmer and Lemeshow, 2000:14-15).
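The univariate likelihood ratio test can be sketched in a few lines. The code below is an illustration, not MINITAB's procedure: it evaluates $G$ as twice the difference between the fitted log-likelihood and the log-likelihood of the model without the covariate, and it obtains the one-degree-of-freedom p-value from the identity $P(\chi^2_1 > G) = P(|Z| > \sqrt{G})$. The function names and the fitted probabilities are hypothetical.

```python
import math

def chi2_1df_pvalue(stat):
    """P(chi-square with 1 df > stat), using the standard normal tail."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))

def likelihood_ratio_g(ys, ps):
    """G for a univariate model; requires at least one y = 1 and one y = 0."""
    n = len(ys)
    n1 = sum(ys)
    n0 = n - n1
    loglik_fitted = sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p) for y, p in zip(ys, ps)
    )
    # Log-likelihood of the model that does not contain the covariate.
    loglik_null = n1 * math.log(n1) + n0 * math.log(n0) - n * math.log(n)
    return 2.0 * (loglik_fitted - loglik_null)

# Hypothetical responses and fitted probabilities.
ys = [0, 0, 1, 0, 1, 0, 1, 1]
ps = [0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85]
g = likelihood_ratio_g(ys, ps)
print(f"G={g:.4f}, p-value={chi2_1df_pvalue(g):.4f}")
```

The covariate would be judged significant at the 0.05 level only if the printed p-value fell below 0.05.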

The Hessian matrix, or the $(k + 1) \times (k + 1)$ matrix of second partial derivatives of
equation (2.14), is derived to support the Wald test. The quantities of interest are the
diagonal elements of the negative inverse of the Hessian, which are evaluated at the
maximum likelihood estimators $\hat{\boldsymbol{\beta}}$. The square roots of these diagonal elements are the
standard errors of the coefficients of equation (2.4), which MINITAB computes
automatically. The Wald test statistic, $W$, under the null hypothesis in equation (2.29) is

$$W = \frac{\hat{\beta}_j}{\widehat{se}(\hat{\beta}_j)} \tag{2.32}$$

where $\widehat{se}(\hat{\beta}_j)$ denotes the standard error of the $j$th regression coefficient. Two methods
can be used to evaluate $W$, but MINITAB uses a p-value approach. The Wald statistic
can be squared and compared to a chi-square distribution with one degree of freedom, as
with the likelihood ratio test. MINITAB examines a probability taken from the standard
normal distribution. That is, if $P(|Z| > |W|) < \alpha$, then the covariate can be said to
contribute significantly to the model (Montgomery, Peck, and Vining, 2001:458).

Confidence intervals (CIs) on both the estimated model parameters and the odds
ratios can be computed. A CI provides a degree of assurance about the accuracy of a
maximum likelihood estimate (MLE). The narrower the CI, the higher
the confidence is that the MLE closely approximates the true parameter value.

MINITAB, however, only outputs CIs for the estimated odds ratios. Consequently, only
the procedures for constructing CIs on odds ratios are described here, but inferences for
CIs on the model coefficients can easily be made. MINITAB constructs 95% CIs by
default. Thus, at the $\alpha = 0.05$ level of significance, a 95% CI on the odds ratio is
expressed as

$$\exp\left[ \hat{\beta}_j \pm z_{1-0.05/2} \times \widehat{se}(\hat{\beta}_j) \right] \tag{2.33}$$

(Hosmer and Lemeshow, 2000:52-53).
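The Wald test and the confidence interval on the odds ratio both follow from a coefficient estimate and its standard error. The sketch below assumes those two numbers have already been read from software output; the values used, the function names, and the hard-coded normal quantile 1.959964 are illustrative assumptions.

```python
import math

def wald_test(beta_hat, se):
    """Wald statistic and its two-sided standard normal p-value."""
    w = beta_hat / se
    p_value = math.erfc(abs(w) / math.sqrt(2.0))  # P(|Z| > |W|)
    return w, p_value

def odds_ratio_ci(beta_hat, se, z=1.959964):
    """Lower limit, point estimate, and upper limit for the odds ratio."""
    lower = math.exp(beta_hat - z * se)
    upper = math.exp(beta_hat + z * se)
    return lower, math.exp(beta_hat), upper

# Hypothetical coefficient and standard error.
beta_hat, se = 0.002, 0.0008
w, p_value = wald_test(beta_hat, se)
lower, point, upper = odds_ratio_ci(beta_hat, se)
print(f"W={w:.3f}, p-value={p_value:.4f}")
print(f"odds ratio={point:.5f}, 95% CI=({lower:.5f}, {upper:.5f})")
```

Because the interval is built on the coefficient scale and then exponentiated, it is symmetric about the odds ratio only on the log scale. An interval that excludes 1 corresponds to a coefficient that differs significantly from zero.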


Multinomial  Logistic  Regression. 

When the focus of a war termination study is placed on the methods by which
wars end, rather than on the winners and losers of wars, examination of Pillar's analyses
alone shows the response variable of interest to contain more than two categories, or
methods of ending wars. Hence, binary logistic regression cannot be used to analyze this
situation because the response is polychotomous, rather than dichotomous. Modifications
to the binary logistic regression model were made in 1974, and the result was the
multinomial logistic regression model (Hosmer and Lemeshow, 2000:260). The term
multinomial is used because the outcome variable, or type of war ending, is said to be
nominal. This follows from the fact that types of war endings cannot be ordered in any
statistically meaningful way (Hosmer and Lemeshow, 2000:260).

The simplest way to demonstrate the theory behind multinomial logistic
regression is through the case where the response contains $p = 3$ categories, though
extensions of the model can easily be made for responses containing more than three
categories. Let the categories of the response variable, $Y$, be coded as 0, 1, and 2.
MINITAB, however, allows the response code to begin with 1, rather than 0. For any
response with $p$ categories, a reference category must be selected, to which the remaining
$p - 1$ categories are compared. The $Y = 0$ category is selected as the reference category
for explaining multinomial logistic regression theory here, which is the same assumption
made by Hosmer and Lemeshow (Hosmer and Lemeshow, 2000:261).

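While binary logistic regression makes use of only one logit function,
multinomial logistic regression produces $p - 1$ logits. Each logit is expressed as the
natural logarithm of a ratio of conditional probabilities. In general, the conditional
probability for the $j$th response category given $\mathbf{x}$, where $\mathbf{x}$ is a vector of $k$ covariates plus
a constant term, is given by

$$\pi_j(\mathbf{x}) = P(Y = j \mid \mathbf{x}) = \frac{\exp(g_j(\mathbf{x}))}{1 + \sum_{l=1}^{p-1} \exp(g_l(\mathbf{x}))} \tag{2.34}$$

with $g_0(\mathbf{x}) = 0$ for the reference category. The $j$th logit, for which MLEs are computed,
is denoted as

$$g_j(\mathbf{x}) = \ln\left[\frac{P(Y = j \mid \mathbf{x})}{P(Y = 0 \mid \mathbf{x})}\right] = \beta_{j0} + \beta_{j1} x_1 + \beta_{j2} x_2 + \cdots + \beta_{jk} x_k \tag{2.35}$$

where $j = 1, 2, \ldots, p - 1$. It follows that the logit for $Y = i$ versus $Y = j$ can be computed
by

$$\begin{aligned}
g_i(\mathbf{x}) - g_j(\mathbf{x}) &= \ln\left[\frac{P(Y = i \mid \mathbf{x})}{P(Y = 0 \mid \mathbf{x})}\right] - \ln\left[\frac{P(Y = j \mid \mathbf{x})}{P(Y = 0 \mid \mathbf{x})}\right] \\
&= \ln\left[\frac{P(Y = i \mid \mathbf{x})\, P(Y = 0 \mid \mathbf{x})}{P(Y = 0 \mid \mathbf{x})\, P(Y = j \mid \mathbf{x})}\right] \\
&= \ln\left[\frac{P(Y = i \mid \mathbf{x})}{P(Y = j \mid \mathbf{x})}\right] \\
&= \mathbf{x}'(\boldsymbol{\beta}_i - \boldsymbol{\beta}_j)
\end{aligned} \tag{2.36}$$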


For the purpose of clarifying the likelihood function, the response is coded using
indicator, or dummy, variables (Montgomery, Peck, and Vining, 2001:265). A $p$-category
response can be coded using $p$ dummy variables as follows:

If $Y = 0$, then $v_0 = 1$, $v_1 = 0$, $v_2 = 0$, ..., $v_{p-1} = 0$.

If $Y = 1$, then $v_0 = 0$, $v_1 = 1$, $v_2 = 0$, ..., $v_{p-1} = 0$.

If $Y = 2$, then $v_0 = 0$, $v_1 = 0$, $v_2 = 1$, ..., $v_{p-1} = 0$.

If $Y = p - 1$, then $v_0 = 0$, $v_1 = 0$, $v_2 = 0$, ..., $v_{p-1} = 1$.

Consequently, $\sum_{j=0}^{p-1} v_{ji} = 1$ for any $i = 1, 2, \ldots, n$, where $v_{ji}$ denotes the value of
$v_j$ for the $i$th observation.

Letting $\pi_{ji}$ denote the $j$th conditional probability function corresponding to the response
from the $i$th observation, and letting $g_{ji}$ denote the $j$th logit corresponding to the
response from the $i$th observation, the conditional likelihood function takes the form

$$l(\boldsymbol{\beta}) = \prod_{i=1}^{n} \left[ \pi_{0i}^{\,v_{0i}}\, \pi_{1i}^{\,v_{1i}} \cdots \pi_{(p-1)i}^{\,v_{(p-1)i}} \right]$$

It follows that the log-likelihood function is

$$L(\boldsymbol{\beta}) = \sum_{i=1}^{n} \left[ v_{1i} g_{1i} + v_{2i} g_{2i} + \cdots + v_{(p-1)i} g_{(p-1)i} - \ln\left( 1 + e^{g_{1i}} + e^{g_{2i}} + \cdots + e^{g_{(p-1)i}} \right) \right] \tag{2.37}$$

Taking first partial derivatives yields $(p - 1)(k + 1)$ likelihood equations. This result is
shown by noting that a $p$-category response produces $p - 1$ logits, each containing $k + 1$
parameters. As with binary logistic regression, setting the likelihood equations to zero
and solving for $\boldsymbol{\beta}$ gives the MLEs, $\hat{\boldsymbol{\beta}}$, which are again obtained via the IRLS procedure
(Hosmer and Lemeshow, 2000:262-263).
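The baseline-category structure described above can be sketched for a three-category response. In the illustration below, each non-reference category has its own coefficient list, the reference category $Y = 0$ has a logit of zero, and the log-likelihood adds the observed category's logit minus the log of the common denominator for every observation. The function names, coefficients, and data are hypothetical.

```python
import math

def logits(x, betas):
    """The p - 1 logits g_1(x), ..., g_{p-1}(x); each b is [b_j0, b_j1, ..., b_jk]."""
    return [b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)) for b in betas]

def category_probabilities(x, betas):
    """[P(Y=0|x), P(Y=1|x), ..., P(Y=p-1|x)] with Y = 0 as the reference."""
    g = logits(x, betas)
    denom = 1.0 + sum(math.exp(gj) for gj in g)
    return [1.0 / denom] + [math.exp(gj) / denom for gj in g]

def log_likelihood(xs, ys, betas):
    """Multinomial log-likelihood for responses coded 0, 1, ..., p - 1."""
    total = 0.0
    for x, y in zip(xs, ys):
        g = logits(x, betas)
        # The dummy variables pick out the logit of the observed category;
        # the reference category contributes a logit of zero.
        g_observed = g[y - 1] if y > 0 else 0.0
        total += g_observed - math.log(1.0 + sum(math.exp(gj) for gj in g))
    return total

# Hypothetical coefficients for p = 3 categories and k = 1 covariate.
betas = [[0.5, -1.0], [-0.2, 0.8]]
xs = [[0.0], [1.0], [2.0]]
ys = [0, 2, 1]
print(category_probabilities([1.0], betas))
print(log_likelihood(xs, ys, betas))
```

Maximizing this log-likelihood over all $(p - 1)(k + 1)$ coefficients would require an iterative routine such as IRLS, which the sketch does not attempt.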

Interpretation of the parameters is similar to that of the binary model. There are
$k(p - 1)$ odds ratios to compute, in which each of the remaining $p - 1$ response values is
compared to the reference category. It is assumed here that the reference outcome is
$Y = 0$, but MINITAB allows the selection of any category as the reference. For a
continuous covariate, the odds ratio comparing $Y = j$ to $Y = 0$ associated with a one-unit
change in $x$ is expressed as

$$OR_j = \frac{P(Y = j \mid X = x + 1) \,/\, P(Y = 0 \mid X = x + 1)}{P(Y = j \mid X = x) \,/\, P(Y = 0 \mid X = x)} \tag{2.38}$$

(Hosmer and Lemeshow, 2000:265).

Calculations for the likelihood ratio statistic are similar to those for the binary
logistic regression model. The difference lies in the degrees of freedom associated with
it. For a continuous covariate, the likelihood ratio statistic, $G$, is distributed chi-square
with $p - 1$ degrees of freedom. For a categorical covariate, also called a factor, the
degrees of freedom become $(p_r - 1)(p_f - 1)$, where $p_r$ is the number of categories in the
response, and $p_f$ is the number of categories in the factor (Hosmer and Lemeshow,
2000:270).

Hosmer and Lemeshow note that ideas for extending diagnostic measures into
multinomial models have been proposed. Current statistical software packages, however,
have not incorporated such proposals because the measures involved are computationally
intensive (Hosmer and Lemeshow, 2000:281). As a result, diagnostic measures and plots
were not generated for the multinomial models on interstate wars in this study. The odds
ratios, goodness-of-fit tests, and likelihood ratio tests were considered sufficient to
achieve the overarching goal of demonstrating the applicability of multinomial logistic
regression to war termination investigations.


Summary


This thesis seeks to define probabilistic relationships between the outcomes or
winners of wars and a single explanatory variable or a group of explanatory variables.
Constructing the best descriptive and most parsimonious models from the available
open-source data is also desired. Logistic regression techniques provide readily
interpretable ways of defining such relationships. Because war is a complex endeavor and
the conduct of war is highly dynamic, the termination of war is described best through
conditional probabilities and likelihoods. The results of logistic regression can also
provide additional insights into what levels of which explanatory variables are either
necessary or acceptable in order to either achieve a particular war ending or emerge
victorious from a war.


III. Methodology


Rationale 

The  goal  of  this  research  is  to  investigate  and  define,  if  possible,  relationships 
between  several  independent  variables  and  either  the  winner  of  a  war  or  the  manner  in 
which  a  war  ends.  Given  the  qualitative  nature  of  the  dependent  variables  of  the  selected 
data  sets,  a  logistic  regression  approach  is  the  preferred  method  to  model  such 
relationships.  The  dependent  variable  is  commonly  called  the  response,  and  the 
independent  variables  are  called  covariates  (Hosmer  and  Lemeshow,  2000:1). 

Unlike the response in linear regression, the response for each data set is categorical. For the
interstate  wars  set,  the  response  is  denoted  by  the  variable  Outcome.  For  both  the 
intrastate  wars  and  extra-state  wars  sets,  the  response  is  denoted  by  the  variable  Winner. 
Each  of  the  response  variables  is  nominal.  That  is,  no  natural  ordering  of  its  categories 
exists,  and  numerical  differences  between  categories  are  meaningless.  Each  response 
contains  six  categories,  so  the  resulting  model  is  called  a  polychotomous  or  multinomial 
logistic regression model (Hosmer and Lemeshow, 2000:260). The term multinomial is
preferred  in  this  thesis. 


Variable  Selection 

The  data  set  concerning  participants  in  interstate  wars  initially  contained  28 
variables.  These  variables  and  their  COWP  definitions  are  given  in  Appendix  A.  The 
COWP assigned a unique number to each participant, called a country code, so it was
assumed that neither the country code nor the three-letter country abbreviation needed to be
included in the final data set. The initial set also contained variables for the days,
months, and years in which the individual wars began and ended. The COWP included a
second set of date columns for those wars in which there was a short break in the
fighting, but the war started up again. Existing war termination literature does not appear
to emphasize the importance of dates. It was therefore believed that these variables were
unnecessary for the analysis, so the date columns were not added to the final data set. A
similar  assumption  was  made  about  the  variables  concerning  the  geographic  location  of 
the  wars,  although  this  may  be  an  area  for  future  investigation.  Ultimately,  five  variables 
were  retained  for  analysis:  the  outcome  of  the  war  for  the  participating  nation,  the 
duration  of  the  war  in  days,  the  participating  nation’s  population  at  the  war’s  outset,  the 
participating  nation’s  military  manpower  at  the  war’s  outset,  and  the  number  of  combat 
deaths  sustained  during  the  war  by  the  participating  nation.  Identical  assumptions  were 
made  for  both  the  extra-state  and  intrastate  war  sets,  and  the  same  five  variables  were 
retained.  However,  the  response  variable  was  defined  by  who  won  the  conflict,  rather 
than  how  the  conflict  ended. 


Variable  Translation 

Any  nation,  past  or  present,  has  or  has  had  the  potential  to  engage  in  armed 
conflict.  Some  nations  are  small,  and  some  are  considered  superpowers.  Therefore,  it  is 
not  sufficient  to  analyze  the  raw  data.  Measures  that  adequately  describe  the  entire 
population  of  belligerents  are  needed.  Expressing  the  casualty,  population,  and  armed 
forces  data  as  proportions  was  believed  to  yield  more  meaningful  and  interpretable  results 
than the raw numbers. Three proportions were computed for each observation in each
data set,

\[ \text{\%\_Casualties} = \frac{\text{C\_Deaths}}{\text{Tot\_Deaths}} \qquad (3.1) \]

\[ \text{Deaths/Pop\_\%} = \frac{\text{C\_Deaths}}{\text{PWarPop}} \qquad (3.2) \]

\[ \text{Deaths/Arm\_\%} = \frac{\text{C\_Deaths}}{\text{PWarArm}} \qquad (3.3) \]

where C_Deaths is the number of casualties sustained by the participant during the war,
Tot_Deaths is the sum of casualties sustained by all belligerents during the war,
PWarPop is the participating nation's population at the start of the war, and PWarArm is
the size of the participating nation's armed forces at the start of the war.
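The three proportions in equations (3.1) through (3.3) can be illustrated with a minimal sketch. The record type, field names, and numeric values below are hypothetical and are not taken from the COWP data; they only mirror the symbols defined above.

```python
from dataclasses import dataclass


@dataclass
class WarParticipant:
    """One participant-in-one-war record; field names are illustrative only."""
    c_deaths: float    # combat deaths sustained by this participant (C_Deaths)
    tot_deaths: float  # combat deaths summed over all belligerents (Tot_Deaths)
    pwar_pop: float    # population at the start of the war (PWarPop)
    pwar_arm: float    # armed forces size at the start of the war (PWarArm)


def casualty_proportions(rec: WarParticipant) -> dict:
    """Return the three proportions of equations (3.1)-(3.3)."""
    return {
        "pct_casualties": rec.c_deaths / rec.tot_deaths,  # equation (3.1)
        "deaths_per_pop": rec.c_deaths / rec.pwar_pop,    # equation (3.2)
        "deaths_per_arm": rec.c_deaths / rec.pwar_arm,    # equation (3.3)
    }


# Hypothetical values, chosen only to make the arithmetic easy to follow.
example = WarParticipant(c_deaths=1_000, tot_deaths=4_000,
                         pwar_pop=2_000_000, pwar_arm=50_000)
print(casualty_proportions(example))
```

Expressing each quantity as a ratio puts small and large belligerents on a comparable footing, which is the motivation given above for not analyzing the raw counts.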

An  attempt  was  made  to  create  a  proxy  measure  of  the  economic  costs  of  wars 
and  include  such  a  measure  in  the  multinomial  logistic  regression  model.  This  proxy 
measure  was  derived  from  other  data  sets  compiled  by  the  COWP.  In  their  National 
Material  Capabilities  (NMC)  data  set,  the  COWP  included  yearly  observations  of  military 
expenditures,  in  millions  of  2001  US  dollars  (USD).  The  variables  for  this  set  and  their 
definitions  are  given  in  Appendix  C. 

For  each  war  participant,  the  average  amount  of  military  expenditures,  denoted  as 
Avg_Milex,  was  computed  for  the  duration  of  each  war.  The  desire  was  to  take  that 
average  and  divide  it  by  the  average  gross  domestic  product  (GDP)  for  each  war 
participant  during  each  war,  which  would  have  given  a  proxy  measure  for  the  degree  to 
which  a  nation’s  industrial  capacity  is  consumed  by  war.  Unfortunately,  GDP  figures 
could  not  be  obtained  for  wars  occurring  earlier  than  1870,  and,  of  the  GDP  estimates 
available, not enough countries had GDP observations to cover the number of
participants in the interstate wars data set. It should be noted that while GDP figures
might be obtained from other sources, one secondary objective of this study was to use
data only from the same open source, the COWP. As a result, another more available
proxy  economic  indicator  was  used.  The  COWP,  in  its  data  set  on  national  trade, 
compiled  total  trade  estimates  for  each  of  the  countries  in  the  interstate  wars  set. 

\[ \text{avgME\_as\_PTT} = \frac{\text{Avg\_Milex}}{\text{Avg\_TTrade}} \qquad (3.4) \]

The  COWP  computed  total  trade  as  a  sum  of  a  nation’s  total  imports  and  total 
exports  for  a  given  year,  all  in  2001  USD.  For  each  war  participant,  the  average  total 
trade, Avg_TTrade, was computed for the duration of each war, and this amount was used
as  the  divisor  in  lieu  of  average  GDP.  This  proxy  measure  was  defined  as  the  average 
amount  of  military  spending  as  a  proportion  of  the  average  total  trade  for  the  war. 

Without  consistent  GDP  estimates,  this  measure  was  proposed  as  the  best  economic 
activity  indicator  available  for  this  analysis. 
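The proxy measure of equation (3.4) can be sketched as follows. This is a minimal illustration, assuming yearly military expenditure and total trade figures are held in dictionaries keyed by year and expressed in the same units; the function name, the handling of missing years, and the example figures are assumptions, not the procedure actually used in this study.

```python
def milex_share_of_trade(milex_by_year, trade_by_year, start_year, end_year):
    """Average military expenditure divided by average total trade over a war.

    Both inputs map year -> amount in the same units (for example, 2001 USD).
    Years with no observation are skipped; None is returned when the ratio
    cannot be formed.
    """
    years = range(start_year, end_year + 1)
    milex = [milex_by_year[y] for y in years if y in milex_by_year]
    trade = [trade_by_year[y] for y in years if y in trade_by_year]
    if not milex or not trade:
        return None
    avg_milex = sum(milex) / len(milex)    # Avg_Milex
    avg_ttrade = sum(trade) / len(trade)   # Avg_TTrade
    if avg_ttrade == 0:
        return None
    return avg_milex / avg_ttrade


# Hypothetical yearly figures for a three-year war.
milex = {1914: 100.0, 1915: 200.0, 1916: 300.0}
trade = {1914: 1000.0, 1915: 1000.0, 1916: 1000.0}
print(milex_share_of_trade(milex, trade, 1914, 1916))
```

Averaging each series over the war years before dividing follows the wording of the definition above; dividing year by year and then averaging would generally give a different number.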

The  category  definitions  for  the  response  Outcome  in  the  interstate  wars  data  set 
were  revised  from  those  given  by  the  COWP,  which  are  given  in  Table  1.  Determining 
the  likelihood  of  one  type  of  outcome  over  another  was  assumed  to  be  more  important  to 
this  study  than  knowing  on  which  side  a  given  country  participated,  so  the  new 
definitions  were  created  by  comparing  the  COWP  definitions  to  those  of  Paul  Pillar’s 
classifications.  The  revised  response  categories  for  the  interstate  wars  data  are  given  in 
Table  2.  In  contrast,  the  response  categories  for  the  intrastate  and  extra-state  sets  did  not 
require  revision,  and  the  next  section  explains  this  case. 
Table 1: COWP Outcomes for Interstate Wars

Category   COWP Definition
1          On Winning Side
2          On Losing Side
3          On Side A of a Tie
4          On Side B of a Tie
5          On Side A of an Ongoing War
6          On Side B of an Ongoing War

For  the  cases  where  either  a  total  military  conquest,  which  Pillar  calls 
extermination  or  expulsion,  or  an  imposed  settlement  ends  a  war,  it  was  assumed  that  the 
victor’s  military  force  was  the  dominant  factor.  That  is,  the  winning  side  inflicts  military 
defeats  upon  his  enemy  to  such  an  extent  that  his  enemy  must  give  up  the  fight  through 
either  unconditional  surrender  or  capitulation  to  terms  imposed  upon  him  during  an 
armistice  or  cease-fire.  These  cases  were  subsequently  defined,  and  thus  categorized,  as 
victory  through  military  imposition  (Pillar,  1983:14). 

The  converse  of  the  aforementioned  definition  was  assumed  to  be  true  when 
considering  a  martially  defeated  nation.  The  losing  country  agrees  to  the  demands  of  the 
victor,  no  matter  in  what  manner  such  an  agreement  occurs.  Pillar’s  description  of  this 
type  of  situation  was  considered  accurate,  so  this  category  was  called  capitulation  (Pillar, 
1983:15). 

Defining  the  cases  where  no  clear  victor  exists,  or  where  a  clear  military  victor 
emerges  without  the  capitulation  of  the  defeated,  is  difficult.  Pillar  refers  to  a  mutual 
withdrawal  of  military  forces,  either  with  or  without  an  agreement  (Pillar,  1983:14). 
However,  in  order  to  distinguish  from  a  negotiation,  it  is  assumed  that  fighting  ceases 
without  any  resolution  of  the  issues  over  which  the  war  was  waged.  The  circumstances 
surrounding some cease-fires and armistices may cause them to fall into this category,
such as those of the cease-fires between Israel and one or more of the Arab states in 1949,
1956, 1967, 1973, 1982, and 2006 (Pillar, 1983:22-23). These cases constitute
stalemates.

Another difficulty arose with the few observations where the participants began
fighting a small war, but either the conflict grew into a major war through third-party
intervention, or the participants joined allies in a larger war to fight for different aims.
Pillar calls this absorption (Pillar, 1983:14). Because there were so few of these cases in
the data set, each observation exhibiting this result was examined to find conditions that
would allow it to be placed in a previously defined category. Such conditions existed in
some of the observations, but not in all. Since the sample size for the interstate wars set
was larger than 200 observations, it was assumed that the two observations fitting the
aforementioned description would inflate the range of the CIs for the resulting odds
ratios, so the two observations were omitted from the data set.

When imposition, capitulation, or a stalemate does not occur, then the possibility
exists for a mutual agreement between all belligerents to occur. Such an agreement is not
one-sided, but rather all sides make concessions in order to form a pact about which all
can be satisfied. In such situations of compromise, it is assumed that some form of
negotiation between opposing nations must take place (Pillar, 1983:15). Unlike Pillar,
who makes a distinction between agreements between belligerents and third-party
mediations, this study assumes that the fact that a compromise is struck is more important
than the manner in which it is struck.

The COWP also compiled a data set concerning international disputes, called
Militarized Interstate Disputes (MID). The variables for the MID set and their COWP
definitions are given in Appendix B. The subset of the MID set where the disputes
resulted in wars matched exactly to the observations in the interstate wars set. The
advantage to this was that the values for the outcome and settlement variables in the MID
subset  could  be  directly  compared  to  the  corresponding  values  for  the  response  in  the 
interstate  wars  set.  The  purpose  of  this  comparison  was  to  distinguish  between  those 
participants  who  benefited  the  most,  or  won,  through  a  negotiated  settlement  and  those 
participants  who  gained  the  least,  or  lost,  through  a  negotiated  settlement.  That  is,  those 
observations  whose  MID  outcome  was  a  compromise  and  settlement  was  negotiated,  but 
whose  interstate  wars  outcome  was  a  victory,  are  coded  under  the  category  of  victory  by 
negotiated  settlement.  Those  observations  whose  MID  outcome  was  a  compromise  and 
settlement  was  negotiated,  but  whose  interstate  wars  outcome  was  a  yield,  are  coded 
under  the  category  of  defeat  by  negotiated  settlement. 


Table 2: Revised Outcomes for Interstate Wars

Category   Revised Definition
1          Victory by Military Imposition
2          Capitulation
3          Stalemate
4          Victory by Negotiated Settlement
5          Defeat by Negotiated Settlement
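The recoding from the COWP outcome categories to the revised categories can be sketched as a small lookup. This is only an illustration of the rules described above: the string labels for the MID outcome and settlement values are hypothetical, the assignment of COWP ties to the stalemate category is an assumption, and ongoing wars are simply left uncoded here.

```python
def revised_outcome(cowp_outcome, mid_outcome=None, mid_settlement=None):
    """Map a COWP interstate outcome (Table 1) to a revised category (Table 2).

    mid_outcome and mid_settlement use hypothetical labels; a war counts as
    ended by negotiation when the MID outcome is a compromise and the MID
    settlement is negotiated.
    """
    negotiated = (mid_outcome == "compromise" and mid_settlement == "negotiated")
    if cowp_outcome == 1:            # on winning side
        return 4 if negotiated else 1
    if cowp_outcome == 2:            # on losing side
        return 5 if negotiated else 2
    if cowp_outcome in (3, 4):       # either side of a tie (assumed stalemate)
        return 3
    return None                      # ongoing wars are not assigned a category


print(revised_outcome(1))                              # victory by military imposition
print(revised_outcome(1, "compromise", "negotiated"))  # victory by negotiated settlement
print(revised_outcome(2, "compromise", "negotiated"))  # defeat by negotiated settlement
```

In practice each observation would still need the case-by-case review described in the text, since circumstances such as cease-fires without resolution cannot be read from the category codes alone.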

Data  Compression 

The  next  obstacle  was  to  deal  with  any  missing  data  for  each  set.  Each  variable 
had  missing  entries,  but  not  all  of  the  missing  entries  occurred  in  the  same  observation. 
Several  statistical  techniques  could  have  been  used  to  fill  in  the  missing  entries,  but  the 
sample  sizes  for  each  set  remained  sufficiently  large  with  the  observations  corresponding 
to the missing entries omitted. The rule of 10, as discussed by Hosmer and Lemeshow,
was  used  to  justify  eliminating  the  missing  data  points  from  the  final  sets  (Hosmer  and 
Lemeshow,  2000:346-347). 

The  objective  of  the  rule  of  10  is  to  determine  the  number  of  observations  per 
estimated  parameter  needed  to  avoid  poor  model  variance  estimates.  Reviewing  the 
observations  per  parameter  also  allows  the  flexibility  to  postulate  higher-order  models,  as 
opposed  to  main  effects  models  only.  Hosmer  and  Lemeshow  use  the  quantity 
$m = \min(n_1, n_0)$, where $n_1$ and $n_0$ are the frequencies of those observations yielding
responses of 1 and 0, respectively. However, the above quantity assumes the use of a
typical  dichotomous,  or  binomial,  logistic  regression  model,  where  the  outcome  can  only 
assume  one  of  two  values  (Hosmer  and  Lemeshow,  2000:346). 

The  response  Outcome  in  the  interstate  wars  set  contains  five  categories,  so  the 
quantity  used  by  Hosmer  and  Lemeshow  is  revised  to  reflect  a  multinomial  logistic 
regression  model. 

\[ m = \min(n_0, n_1, n_2, n_3, n_4) \qquad (3.5) \]

For equation (3.5), $n_0$ is the number of wars where the participant wins by military
imposition, $n_1$ is the number of wars where the participant loses through capitulation, $n_2$
is the number of wars ending by stalemate, $n_3$ is the number of wars where the participant
wins through a negotiated settlement, and $n_4$ is the number of wars where the participant
loses through a negotiated settlement. After eliminating the observations containing
missing data, 225 observations remained. The least frequent response was
$m = \min(87, 53, 31, 26, 28) = 26$, or a victory through a negotiated settlement. For $k$
covariates, Hosmer and Lemeshow suggest that $k + 1 \le m/10$ parameters be included in
the model, where $k + 1$ is the number of covariates plus an intercept term (Hosmer and
Lemeshow, 2000:346). No more than $26/10 = 2.6 \approx 2$ parameters should be included in
the interstate wars model, which corresponds to a univariate, or single-variable main
effects, model.

For  both  the  extra-state  and  intrastate  wars  sets,  when  the  observations  containing 
missing  data  were  eliminated,  their  respective  response  categories  reduced  to  the 
binomial  case.  That  is,  the  remaining  response  values  corresponded  to  either  the  state 
winning  or  the  non-state  actor  or  insurgency  winning.  Table  3  shows  the  resulting 
categories  and  definitions  for  both  the  extra-state  and  intrastate  data  sets. 

Table 3: Winner Categories for Extra/Intrastate Wars

Category   Definition
1          State Wins
2          Non-State Actor/Insurgency Wins

Let $m_1$ be the smaller frequency for the intrastate data set, and let $m_2$ be the
smaller frequency for the extra-state data set. For the intrastate wars,
$m_1 = \min(49, 24) = 24$, so the model should contain no more than
$24/10 = 2.4 \approx 2$ parameters, which again corresponds to a univariate main-effects model.
For the extra-state wars, $m_2 = \min(40, 19) = 19$, so its model should have
$19/10 = 1.9 \approx 1$ parameter, which would exclude any covariates and contain only a
constant term.
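The rule of 10 arithmetic for the three data sets can be reproduced with a short sketch. The function name is hypothetical; rounding down to a whole number of parameters follows the way the quotients are treated in the text.

```python
def rule_of_10(category_counts):
    """Return (m, max_parameters) where m is the least frequent response count.

    max_parameters counts the intercept, so the number of covariates allowed
    is max_parameters - 1.
    """
    m = min(category_counts)
    return m, m // 10


data_sets = [
    ("interstate", (87, 53, 31, 26, 28)),
    ("intrastate", (49, 24)),
    ("extra-state", (40, 19)),
]
for name, counts in data_sets:
    m, max_params = rule_of_10(counts)
    print(name, m, max_params, max(max_params - 1, 0))
# interstate 26 2 1
# intrastate 24 2 1
# extra-state 19 1 0
```

The last column is the number of covariates left once the intercept is counted, which is why the guideline points to single-covariate models for two of the sets and a constant-only model for the extra-state set.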

It should be noted that the rule of 10 is not absolute. Hosmer and Lemeshow
insist  that  it  be  used  as  a  guideline  only.  Other  considerations  must  be  made,  such  as  the 
balance  of  the  distribution  of  the  covariates,  total  sample  size,  and  any  previously  stated 
requirements. If the distribution of the multinomial response is skewed towards one category
or a subset of its categories, then the applicability of the rule of 10 could be questionable
(Hosmer  and  Lemeshow,  2000:347).  Skewed  response  variables  were  present  in  each  of 
the  three  data  sets  analyzed.  Therefore,  first-order  main-effects  models  including  all 
retained  covariates  were  postulated  initially  for  each  data  set  such  that  the  usefulness  of 
the  rule  of  10,  at  least  in  this  case,  could  be  determined. 


Unit  Normal  Scaling 

Unit  normal  data  scaling  was  used  to  aid  in  interpretation  of  the  odds  ratios  for  the 
fitted  models.  Unlike  the  responses,  the  covariates,  once  translated  into  proportions,  were 
continuous,  so  it  was  assumed  that  each  was  approximately  normally  distributed.  The 
idea  of  a  single-unit  change  in  each  covariate  needed  to  be  defined  as  well.  Unit  normal 
scaling  provided  these  definitions. 

This technique involves transforming a normal random variable into a standard
normal random variable. For $i = 1, 2, \ldots, n$ and for $j = 1, 2, \ldots, k$, the scaled $i$th observation of
the $j$th covariate is expressed as

\[ z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j} \]

where the sample variance of the $j$th covariate is given by

\[ s_j^2 = \frac{\sum_{i=1}^{n} \left(x_{ij} - \bar{x}_j\right)^2}{n - 1} \qquad (3.6) \]

and the sample mean of the $j$th covariate is

\[ \bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}. \]

As with the standard normal distribution, each scaled covariate has a mean of 0 and a
standard deviation of 1 (Montgomery, Peck, and Vining, 2001:113).
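Unit normal scaling of a single covariate can be sketched in a few lines. The function name and the sample values are hypothetical; the computation uses the sample mean and the sample standard deviation with an $n - 1$ divisor, as in the definitions above.

```python
import math


def unit_normal_scale(x):
    """Center a covariate on its sample mean and divide by its sample std. dev."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / (n - 1)  # n - 1 divisor
    if variance == 0:
        raise ValueError("a constant covariate cannot be scaled")
    sd = math.sqrt(variance)
    return [(v - mean) / sd for v in x]


# Hypothetical covariate values, for example war durations in days.
scaled = unit_normal_scale([2, 4, 4, 4, 5, 5, 7, 9])
print(scaled)
```

After scaling, a one-unit change in the covariate corresponds to a change of one sample standard deviation in the original measurement, which is what makes the resulting odds ratios comparable across covariates.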


Trend  Recognition 

The  observations  in  each  of  the  three  data  sets  analyzed  covered  nearly  two 
centuries  of  warfare,  from  as  early  as  1816  to  as  late  as  1997.  In  addition  to  the  obvious 
improvements  in  weapons  and  subsequent  shifts  in  tactics,  the  question  of  whether  or  not 
similar  shifts  in  war  termination  patterns  could  be  found  was  addressed.  In  order  to 
identify  such  pattern  changes,  subsets  of  each  data  set  needed  to  be  analyzed,  which 
prompted  another  question.  How  should  the  data  be  divided? 

Two  methods  of  data  division  were  considered.  Since  the  data  covered  two 
centuries  of  war,  a  proposed  dividing  line  was  the  year  1900.  That  is,  all  observations 
occurring  before  1900  would  be  used  to  fit  one  model,  while  all  observations  occurring  in 
1900  and  after  would  be  used  to  fit  a  separate  model.  This  division  method  could 
account  for  weapons  technology  changes  between  the  nineteenth  and  twentieth  centuries. 
Dividing  the  data  by  major  shifts  in  tactics,  such  as  the  switch  from  Napoleonic-style 
combat  to  smaller  squad-level  maneuvers,  was  another  proposal.  Typically,  though  not 
immediately,  improvements  in  weapons  technology  necessarily  prompt  changes  in  how 
weapons  are  employed  in  war.  While  certainly  open  to  historical  debate,  the  Spanish- 
American  War  of  1898  was  assumed  to  be  the  transition  point  from  Napoleonic  warfare 
to modern, or mechanized, warfare. Ultimately, the composition of the data sets allowed
divisions  such  that  both  changes  in  century  and  changes  in  tactics  could  be 
simultaneously  examined. 

Using the above method, MINITAB was used to fit three logistic regression
models to each of the three war data sets. Multinomial logistic regression was employed
for the interstate wars set, where the response Outcome contained five categories.
Compressing and translating the data from both the intrastate and extra-state sets allowed
the use of binary logistic regression, with Winner as the response in both cases. The first
models for each set were fit using the aggregated data in each set. The second models
were fit using the divided data from the 19th Century, while the last models used the
divided data from the 20th Century.


Variable  Nomenclature 

Different  names  were  given  to  each  response  and  covariate  for  each  data  set.  The 
variable  names  in  each  set  included  a  designator  for  the  data  scaling  technique  used,  unit 
normal scaling (UNS). The variable names were additionally distinguished by century.
The  variable  names  in  the  aggregated  models,  however,  did  not  contain  century 
designators.  Table  4  contains  each  response  variable  name  included  in  this  study  and  its 
corresponding  definition.  The  names  and  definitions  for  the  extra-systemic  war 
covariates  used  in  this  study  are  shown  in  Table  5.  The  names  and  definitions  for  the 
intrastate  war  covariates  used  in  this  study  are  shown  in  Table  6.  The  names  and 
definitions  for  the  interstate  war  covariates  used  in  this  study  are  given  in  Table  7. 
Table 4: Response Variable Nomenclature

Response            Definition
Winner_ES_UNS_19    Extra-systemic War Winner, 19th Century Wars
Winner_ES_UNS_20    Extra-systemic War Winner, 20th Century Wars
Winner_ES_UNS       Extra-systemic War Winner, Aggregated Wars
Winner_IS_UNS_19    Intrastate War Winner, 19th Century Wars
Winner_IS_UNS_20    Intrastate War Winner, 20th Century Wars
Winner_IS_UNS       Intrastate War Winner, Aggregated Wars
Outcome(PR2)_19     Outcome of Interstate War, 19th Century Wars
Outcome(PR2)_20     Outcome of Interstate War, 20th Century Wars
Outcome(PR2)        Outcome of Interstate War, Aggregated Wars

Table 5: Covariate Nomenclature for Extra-Systemic Wars

Covariate                   Definition
Dur_ES_UNS_19               Duration of 19th Century Extra-Systemic War, Unit Normally Scaled
C_Dths/Pop_ES_UNS_19        Proportion of State Deaths to Its Pre-war Population, 19th Century Extra-Systemic Wars, Unit Normally Scaled
C_Dths/Arm_ES_UNS_19        Proportion of State Deaths to Its Pre-war Armed Force Size, 19th Century Extra-Systemic Wars, Unit Normally Scaled
C_Dths/TDths_ES_UNS_19      Proportion of Total Deaths Sustained by Participant, 19th Century Extra-Systemic Wars, Unit Normally Scaled
Dur_ES_UNS_20               Duration of 20th Century Extra-Systemic War, Unit Normally Scaled
C_Dths/Pop_ES_UNS_20        Proportion of State Deaths to Its Pre-war Population, 20th Century Extra-Systemic Wars, Unit Normally Scaled
C_Dths/Arm_ES_UNS_20        Proportion of State Deaths to Its Pre-war Armed Force Size, 20th Century Extra-Systemic Wars, Unit Normally Scaled
C_Dths/TDths_ES_UNS_20      Proportion of Total Deaths Sustained by Participant, 20th Century Extra-Systemic Wars, Unit Normally Scaled
C_Deaths/Pop_ES_UNS         Proportion of State Deaths to Its Pre-war Population, Aggregated Extra-Systemic Wars, Unit Normally Scaled
C_Deaths/Arm_ES_UNS         Proportion of State Deaths to Its Pre-war Armed Force Size, Aggregated Extra-Systemic Wars, Unit Normally Scaled
C_Deaths/TotDeaths_ES_UNS   Proportion of Total Deaths Sustained by Participant, Aggregated Extra-Systemic Wars, Unit Normally Scaled
Duration_ES_UNS             Duration of Aggregated Extra-Systemic Wars, Unit Normally Scaled

Table 6: Covariate Nomenclature for Intrastate Wars

Covariate                   Definition
Duration_IS_UNS_19          Duration of 19th Century Intrastate War, Unit Normally Scaled
Dead/Pop_IS_UNS_19          Proportion of State Deaths to Its Pre-war Population, 19th Century Intrastate Wars, Unit Normally Scaled
Dead/Arm_IS_UNS_19          Proportion of State Deaths to Its Pre-war Armed Force Size, 19th Century Intrastate Wars, Unit Normally Scaled
C_Dead/TotDead_IS_UNS_19    Proportion of Total Deaths Sustained by Participant, 19th Century Intrastate Wars, Unit Normally Scaled
Duration_IS_UNS_20          Duration of 20th Century Intrastate War, Unit Normally Scaled
Dead/Pop_IS_UNS_20          Proportion of State Deaths to Its Pre-war Population, 20th Century Intrastate Wars, Unit Normally Scaled
Dead/Arm_IS_UNS_20          Proportion of State Deaths to Its Pre-war Armed Force Size, 20th Century Intrastate Wars, Unit Normally Scaled
C_Dead/TotDead_IS_UNS_20    Proportion of Total Deaths Sustained by Participant, 20th Century Intrastate Wars, Unit Normally Scaled
Duration_IntS_UNS           Duration of Aggregated Intrastate Wars, Unit Normally Scaled
Dead/Pop_IntS_UNS           Proportion of State Deaths to Its Pre-war Population, Aggregated Intrastate Wars, Unit Normally Scaled
Dead/Arm_IntS_UNS           Proportion of State Deaths to Its Pre-war Armed Force Size, Aggregated Intrastate Wars, Unit Normally Scaled
C_Dead/TotDead_IntS_UNS     Proportion of Total Deaths Sustained by Participant, Aggregated Intrastate Wars, Unit Normally Scaled

Table 7: Covariate Nomenclature for Interstate Wars

Covariate               Definition
Duration_UNS_19         Duration of 19th Century Interstate War, Unit Normally Scaled
Dths/Pop_UNS_19         Proportion of State Deaths to Its Pre-war Population, 19th Century Interstate Wars, Unit Normally Scaled
Dths/Arm_UNS_19         Proportion of State Deaths to Its Pre-war Armed Force Size, 19th Century Interstate Wars, Unit Normally Scaled
MilEx/TT_UNS_19         Proportion of Average State Military Expenditures (2001 USD) to Average State Total Trade (2001 USD), 19th Century Interstate Wars, Unit Normally Scaled
Dths/TDeaths_UNS_19     Proportion of Total Deaths Sustained by Participant, 19th Century Interstate Wars, Unit Normally Scaled
Duration_UNS_20         Duration of 20th Century Interstate War, Unit Normally Scaled
Dths/Pop_UNS_20         Proportion of State Deaths to Its Pre-war Population, 20th Century Interstate Wars, Unit Normally Scaled
Dths/Arm_UNS_20         Proportion of State Deaths to Its Pre-war Armed Force Size, 20th Century Interstate Wars, Unit Normally Scaled
MilEx/TT_UNS_20         Proportion of Average State Military Expenditures (2001 USD) to Average State Total Trade (2001 USD), 20th Century Interstate Wars, Unit Normally Scaled
Dths/TDeaths_UNS_20     Proportion of Total Deaths Sustained by Participant, 20th Century Interstate Wars, Unit Normally Scaled
Duration_UNS            Duration of Aggregated Interstate Wars, Unit Normally Scaled
Deaths/Pop_UNS          Proportion of State Deaths to Its Pre-war Population, Aggregated Interstate Wars, Unit Normally Scaled
Deaths/Arm_UNS          Proportion of State Deaths to Its Pre-war Armed Force Size, Aggregated Interstate Wars, Unit Normally Scaled
Deaths/TotDeaths_UNS    Proportion of Total Deaths Sustained by Participant, Aggregated Interstate Wars, Unit Normally Scaled
MilEx/TotTrade_UNS      Proportion of Average State Military Expenditures (2001 USD) to Average State Total Trade (2001 USD), Aggregated Interstate Wars, Unit Normally Scaled

Stepwise Regression


Stepwise regression is a robust procedure commonly used in both linear and
logistic regression as a model-building technique. This is an effective technique to use
when the true relationship between the covariates and the response is either unknown or
unclear (Hosmer and Lemeshow, 2000:116). Stepwise regression was employed for this
research  because,  as  an  initial  investigation,  the  associations  within  the  COWP  data  were 
unknown.  They  were  also  unclear  in  the  sense  that  war  termination  literature  has 
identified  several  factors,  some  of  which  were  included  in  the  COWP  data,  as  directly 
related  to  the  outcome  of  a  conflict,  but  the  extent  to  which  such  factors  were  statistically 
relevant  had  not  previously  been  determined. 

As  noted  in  the  previous  chapter,  significance  of  a  covariate  in  logistic  regression 
is  identified  by  the  likelihood  ratio  test.  Thus,  the  most  significant  covariate  is  the  one 
with  the  largest  likelihood  ratio  statistic,  G  (Hosmer  and  Lemeshow,  2000: 116).  The 
stepwise  procedure  begins  with  a  pool  of  k  covariates.  The  covariates  can  be  either 
categorical  or  continuous,  but  because  the  covariates  for  this  research  are  continuous,  the 
notation  presented  here  reflects  that  used  for  continuous  covariates  only.  Stepwise 
regression  for  logistic  models  is  described  here  as  a  series  of  steps. 

Step 0: Fit a constant only model. Let $L_0$ be the log-likelihood value for the
constant only model. Estimate $k$ univariate logistic regression models, one for each
covariate in the pool. Let $L_j^{(0)}$ be the log-likelihood value for the model containing the
$j$th covariate in Step 0, where $j = 1, 2, \ldots, k$. Using equation (2.31), the likelihood ratio
test is expressed as

\[ G_j^{(0)} = -2\left(L_0 - L_j^{(0)}\right). \qquad (3.7) \]

Let the p-value for the $j$th likelihood ratio statistic be

\[ \Pr\left[\chi^2_1 > G_j^{(0)}\right] = p_j^{(0)}. \qquad (3.8) \]

Since the most significant covariate is that with the largest likelihood ratio statistic, then
the covariate with the smallest p-value yields the same conclusion. Let

\[ p_{e_1}^{(0)} = \min_j\left(p_j^{(0)}\right), \qquad (3.9) \]

where $p_{e_1}^{(0)}$ denotes the p-value corresponding to the covariate selected to enter the model
at Step 1, provided that the value does not equal or exceed the p-value corresponding to a
previously defined significance level (Hosmer and Lemeshow, 2000:117). Let $p_E$ be the
p-value for entry such that $p_{e_1}^{(0)} < p_E$. If $p_{e_1}^{(0)} \ge p_E$, end the procedure because no
covariates enter the model. Otherwise, let $x_{e_1}$ denote the covariate corresponding to the
minimum p-value, $p_{e_1}^{(0)}$, and go to Step 1 (Hosmer and Lemeshow, 2000:118).
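The likelihood ratio statistic and its p-value for a single candidate covariate can be sketched as below. This is a minimal illustration that assumes one degree of freedom, which fits one continuous covariate added to a binary logistic model; in a multinomial model a single covariate adds one coefficient per non-reference category, so the degrees of freedom would be larger. The function name and the log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Return (G, p) for nested models that differ by one parameter.

    G = -2 * (loglik_reduced - loglik_full). For one degree of freedom the
    chi-square upper tail equals erfc(sqrt(G / 2)).
    """
    g = -2.0 * (loglik_reduced - loglik_full)
    p = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p


# Hypothetical log-likelihoods: constant-only model and one-covariate model.
g, p = likelihood_ratio_test(loglik_reduced=-50.0, loglik_full=-46.0)
print(g, p)
```

The covariate with the largest G among the candidates is also the one with the smallest p-value, so either quantity can be used to rank the candidates at a given step when all candidates share the same degrees of freedom.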

Step 1: Estimate the logistic regression model containing $x_{e_1}$, and let $L_{e_1}^{(1)}$ be the
resulting log-likelihood of the model. Estimate $k-1$ models that contain both $x_{e_1}$ and $x_j$,
where $j = 1, 2, \ldots, k$ and $j \ne e_1$. For each of these $k-1$ models, let $L_{e_1 j}^{(1)}$ denote its
log-likelihood value. The likelihood ratio statistic becomes

\[ G_j^{(1)} = -2\left(L_{e_1}^{(1)} - L_{e_1 j}^{(1)}\right) \qquad (3.10) \]

and its p-value is denoted by $p_j^{(1)}$. Let the covariate corresponding to the smallest
p-value be denoted by $x_{e_2}$, where the smallest p-value is determined by

\[ p_{e_2}^{(1)} = \min_j\left(p_j^{(1)}\right). \qquad (3.11) \]

If $p_{e_2}^{(1)} < p_E$, then add $x_{e_2}$ to the model and go to Step 2. Otherwise, end the procedure.

Step 2: This step includes a provision for backward elimination. The
incorporation of a backward elimination check within what would normally be classified
as a forward selection method gives the stepwise logistic regression procedure its name.
For this step, the backward elimination check examines the possibility that once $x_{e_2}$ is
added to the model, $x_{e_1}$ may no longer be significant. First, estimate a model containing
both $x_{e_1}$ and $x_{e_2}$, and let $L_{e_1 e_2}^{(2)}$ denote the log-likelihood of this model. Let $L_{-e_j}^{(2)}$ denote the
log-likelihood of a model that does not contain $x_{e_j}$, where $j = 1, 2$. The likelihood ratio
test statistic is now expressed as

\[ G_{-e_j}^{(2)} = -2\left(L_{-e_j}^{(2)} - L_{e_1 e_2}^{(2)}\right). \qquad (3.12) \]

Before deciding if a covariate should be removed from the model, a p-value for removal
is defined, denoted $p_R$. This p-value must be assigned such that $p_R > p_E$ so that the
stepwise procedure does not admit and expel the same covariate in consecutive steps.
Converse to the task of admitting a covariate, the decision to remove a covariate from the
model is made by identifying the largest p-value computed from the results of equation
(3.12). This p-value is computed as

\[ p_{r_2}^{(2)} = \max\left(p_{-e_1}^{(2)},\, p_{-e_2}^{(2)}\right), \qquad (3.13) \]

and the covariate associated with $p_{r_2}^{(2)}$ is denoted by $x_{r_2}$. If $p_{r_2}^{(2)} > p_R$, then $x_{r_2}$ is removed
from the model. Otherwise, $x_{r_2}$ remains in the model, and Step 2 continues with the
forward selection phase. Now, estimate $k-2$ models, each containing $x_{e_1}$, $x_{e_2}$, and $x_j$,
where $j = 1, 2, \ldots, k$ and $j \ne e_1, e_2$. Compute the log-likelihood for each of the $k-2$
models, and let $x_{e_3}$ denote the covariate associated with the smallest p-value, where

\[ p_{e_3}^{(2)} = \min_j\left(p_j^{(2)}\right). \qquad (3.14) \]

If $p_{e_3}^{(2)} < p_E$, then add $x_{e_3}$ to the model and go to Step 3. Otherwise, end the procedure.

Step 3: The computations, model entry checks, and model removal checks are
virtually the same as those of Step 2. The full model is estimated, using all of the
covariates entered from previous steps. Reduced models are then fit by deleting each of
the covariates from the full model, one at a time, with replacement. For example, if the
$k^{th}$ reduced model is estimated by deleting the $i^{th}$ covariate from the full model, then the
$(k+1)^{th}$ reduced model is estimated by deleting the $(i+1)^{th}$, or $(i-1)^{th}$, covariate from
the full model, but including the $i^{th}$ covariate. Log-likelihood values are computed for
the full and reduced models, and likelihood ratio statistics comparing the full model to
each of the reduced models are computed. The p-values corresponding to the likelihood
ratio statistics are examined for both the backward elimination and forward selection
phases. If the maximum p-value is greater than $p_R$, then the covariate corresponding to
the maximum p-value is expelled from the model. Otherwise, the covariate
corresponding to the maximum p-value is retained. If the minimum p-value is smaller
than $p_E$, then the covariate corresponding to the minimum p-value is added to the model.
Otherwise, the stepwise procedure ends.

Step 3 is repeated until one of two situations exists: either all k covariates have
been added to the model, or all covariates in the model have p-values which are smaller
than $p_R$. In the latter situation, it must also be the case that all covariates not included in
the model have p-values greater than $p_E$.
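The control flow of the stepwise procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MINITAB implementation: the callbacks `entry_pvalue` and `removal_pvalue` are hypothetical stand-ins for refitting the logistic regression model and computing the likelihood ratio p-values, and the `max_steps` guard is an added safety limit.

```python
def stepwise_select(candidates, entry_pvalue, removal_pvalue,
                    p_enter=0.15, p_remove=0.20, max_steps=100):
    """Sketch of stepwise selection driven by likelihood ratio p-values.

    entry_pvalue(model, x)   -> p-value for adding x to the current model
    removal_pvalue(model, x) -> p-value for deleting x from the current model
    Both callbacks are hypothetical; a real analysis would refit the
    logistic regression model inside them.
    """
    if p_remove <= p_enter:
        raise ValueError("p_remove must exceed p_enter")
    model = []
    for _ in range(max_steps):
        # Backward elimination: expel the covariate with the largest
        # p-value if that p-value exceeds the removal level.
        if model:
            worst = max(model, key=lambda x: removal_pvalue(model, x))
            if removal_pvalue(model, worst) > p_remove:
                model.remove(worst)
        # Forward selection: admit the outside covariate with the
        # smallest p-value if that p-value is below the entry level.
        outside = [x for x in candidates if x not in model]
        if not outside:
            break  # all k covariates are already in the model
        best = min(outside, key=lambda x: entry_pvalue(model, x))
        if entry_pvalue(model, best) >= p_enter:
            break  # nothing qualifies for entry, so the procedure ends
        model.append(best)
    return model
```

Requiring `p_remove > p_enter` mirrors the condition $p_R > p_E$, which keeps a covariate from being admitted and expelled in consecutive steps.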


Summary 

This  chapter  described  the  methodology  used  in  this  study.  In  addition  to  the 
logistic  regression  techniques  presented  in  the  previous  chapter,  the  methods  of  data  and 
variable  manipulation  were  presented  in  detail  in  this  chapter.  All  assumptions  made 
about  the  data,  as  well  as  scaling  and  covariate  selection  techniques,  were  also  presented. 
Chapter  IV  presents  the  results  from  the  analyses  conducted. 
IV. Results and Analysis


Stepwise  Regression 

Because it contained the smallest sample size of the three sets examined, the
extra-state wars set was analyzed first. The analysis began by dividing the data
between 19th and 20th Century observations. Then, a model constructed from all 59
observations was obtained. Stepwise regression was performed on all cases for two
purposes. First, the stepwise procedure fulfilled its customary role of identifying those
covariates necessary to build an adequate logistic regression model on the response.
Second, stepwise regression provided an adequate test for the rule of 10 described in the
previous chapter. The results from stepwise regression are presented first.

Extra-State Wars (19th Century).

Hosmer and Lemeshow state that results from previous research on stepwise
regression significance levels indicate that selecting $p_E$ and $p_R$ from the closed interval
[0.15, 0.20] yields the best results (Hosmer and Lemeshow, 2000:118). In addition,
Hosmer and Lemeshow's selections of $p_E$ and $p_R$ for an example experiment heavily
influenced the entry and removal p-values selected for this research (Hosmer and
Lemeshow, 2000:121). Using the values of $p_E = 0.15$ and $p_R = 0.2$ in MINITAB, the
output of the analysis is shown in Figure 1.

Figure 1: Stepwise Results for 19th Century Extra-Systemic Wars


The fact that none of the covariates entered the model implied that each of the
four p-values, corresponding to the likelihood ratio test statistic, was larger than $p_E$. The
value of $p_E$ could have been iteratively increased until at least one covariate entered
the model. However, increasing $p_E$ would have inflated the risk of allowing
insignificant covariates to enter the model. This risk was already present, given that $p_E$
was already larger than the overall significance level of $\alpha = 0.05$, but $p_E = 0.15$ was
large enough that any initial model whose likelihood ratio test was significant at
the 0.05 level would have been admitted. This was confirmed by fitting four univariate
models in MINITAB and obtaining p-values for each likelihood ratio test. The MINITAB
outputs for the four models are given in Figure 2 through Figure 5. The last p-value
given for each model was the value in question.
Binary Logistic Regression: Winner_ES_UNS_19 versus Dur_ES_UNS_19

Variable          Value  Count
Winner_ES_UNS_19  1         27  (Event)
                  2          8
                  Total     35

Logistic Regression Table

                                                   Odds     95% CI
Predictor          Coef   SE Coef     Z      P    Ratio  Lower  Upper
Constant        1.25577  0.437887  2.87  0.004
Dur_ES_UNS_19  0.144370  0.583536  0.25  0.805     1.16   0.37   3.63

Log-Likelihood = -18.782
Test that all slopes are zero: G = 0.064, DF = 1, P-Value = 0.800

Figure 2: Univariate Logit Model (War Duration is Covariate)


The sample model with duration as its covariate had a 0.8 likelihood ratio p-value.
Because 0.8 > 0.05, the null hypothesis that all model coefficients are zero was not
rejected. Thus, duration was not sufficient to explain the winner of a 19th Century
extra-systemic war.
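The likelihood ratio statistic and its p-value for the duration model can be checked by hand. The sketch below assumes that MINITAB's G compares the fitted model against the constant-only model, whose log-likelihood depends only on the response counts (27 and 8), and it uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(G/2)). The function names are illustrative only.

```python
import math

def null_log_likelihood(counts):
    """Log-likelihood of the constant-only model: sum of n_i * ln(n_i / n)."""
    n = sum(counts)
    return sum(c * math.log(c / n) for c in counts if c > 0)

def lr_test_1df(loglik_fitted, counts):
    """Return (G, p) for a one-covariate binary model vs. the constant-only model."""
    g = -2.0 * (null_log_likelihood(counts) - loglik_fitted)
    g = max(g, 0.0)                      # guard against rounding in the reported value
    p = math.erfc(math.sqrt(g / 2.0))    # chi-square upper tail with 1 df
    return g, p

# Reported log-likelihood of the duration model and the response counts.
g, p = lr_test_1df(-18.782, [27, 8])
print(f"G = {g:.3f}, p-value = {p:.3f}")
```

Because the reported log-likelihood is rounded to three decimals, the recomputed G can differ from the MINITAB value in the last digit.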


Binary Logistic Regression: Winner_ES_UNS_19 versus C_Dths/Pop_ES_UNS_19

Variable          Value  Count
Winner_ES_UNS_19  1         27  (Event)
                  2          8
                  Total     35

Logistic Regression Table

                                                            Odds     95% CI
Predictor                  Coef   SE Coef      Z      P    Ratio  Lower  Upper
Constant                1.21918  0.403617   3.02  0.003
C_Dths/Pop_ES_UNS_19  -0.132086  0.408161  -0.32  0.746     0.88   0.39   1.95

Log-Likelihood = -18.764
Test that all slopes are zero: G = 0.099, DF = 1, P-Value = 0.753

Figure 3: Univariate Logit Model (Proportion of State Population Killed is Covariate)
The sample model with the number of state combat deaths as a proportion of its
population as the covariate had a 0.753 likelihood ratio p-value. Because 0.753 > 0.05,
the null hypothesis that all model coefficients are zero was not rejected. Thus, the
number of state combat deaths as a proportion of its population was not sufficient to
explain the winner of a 19th Century extra-systemic war.


Binary Logistic Regression: Winner_ES_UNS_19 versus C_Dths/Arm_ES_UNS_19

Variable          Value  Count
Winner_ES_UNS_19  1         27  (Event)
                  2          8
                  Total     35

Logistic Regression Table

                                                             Odds     95% CI
Predictor                   Coef   SE Coef      Z      P    Ratio  Lower  Upper
Constant                 1.21314  0.403845   3.00  0.003
C_Dths/Arm_ES_UNS_19  -0.0552200  0.602345  -0.09  0.927     0.95   0.29   3.08

Log-Likelihood = -18.810
Test that all slopes are zero: G = 0.008, DF = 1, P-Value = 0.928

Figure 4: Univariate Logit Model (Proportion of State's Military Killed is Covariate)


The sample model with the number of state combat deaths as a proportion of its
armed force size as the covariate had a 0.928 likelihood ratio p-value. Because
0.928 > 0.05, the null hypothesis that all model coefficients are zero was not rejected.
Thus, the number of state combat deaths as a proportion of its armed force size was not
sufficient to explain the winner of a 19th Century extra-systemic war.
Binary Logistic Regression: Winner_ES_UNS_19 versus C_Dths/TDths_ES_UNS_19

Variable          Value  Count
Winner_ES_UNS_19  1         27  (Event)
                  2          8
                  Total     35

Logistic Regression Table

                                                           Odds     95% CI
Predictor                   Coef   SE Coef     Z      P   Ratio  Lower  Upper
Constant                 1.24049  0.434328  2.86  0.004
C_Dths/TDths_ES_UNS_19  0.559073   1.43888  0.39  0.698    1.75   0.10  29.35

Log-Likelihood = -18.537
Test that all slopes are zero: G = 0.555, DF = 1, P-Value = 0.456

Figure 5: Univariate Logit Model (State's Proportion of Total Deaths is Covariate)


The sample model with the proportion of total combat deaths sustained by the
participant as the covariate had a 0.456 likelihood ratio p-value. Because 0.456 > 0.05,
the null hypothesis that all model coefficients are zero was not rejected. Thus, the
proportion of total combat deaths sustained by the participant was not sufficient to
explain the winner of a 19th Century extra-systemic war.

The results clearly showed that each of the p-values was larger than $p_E = 0.15$, so
the results from the stepwise procedure were confirmed. The focus then shifted to
implications from the rule of 10. Out of $n = 35$ observations, $n_1 = 27$, $n_2 = 8$, and
$m = \min(n_1, n_2) = 8$. Therefore, the resulting model should contain
$k + 1 \le (m/10) = 0.8 \approx 0$ parameters. The rule of 10 proved effective in this case. As
such, similar results were expected for the 20th Century observations.
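The rule of 10 arithmetic used above is simple enough to express directly. The following minimal sketch assumes, as in the text, that the number of parameters (covariates plus the constant) is bounded by the smallest outcome frequency divided by 10, rounded down; the function name is illustrative.

```python
def max_parameters(category_counts):
    """Rule of 10: largest number of parameters (k + 1) the data can support."""
    m = min(category_counts)   # frequency of the least common outcome
    return m // 10             # k + 1 <= m / 10, rounded down

# 19th Century extra-state wars: 27 wins for one side, 8 for the other.
print(max_parameters([27, 8]))
```

A result of 0 or 1 leaves no room for a covariate, since the first parameter is taken by the constant.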

While the p-value significance levels were chosen along the closed interval
[0.15, 0.2], the significance of model coefficients was determined by the p-values
corresponding to their individual Wald statistics. Each Wald statistic, as computed from
equation (2.32), is given under the "Z" column in the MINITAB outputs. The values
under the "P" column are the p-values for each Wald statistic, which should be smaller
than $\alpha = 0.05$ in order to imply significance. Because each of these p-values for each of
the covariates in the univariate models in Figure 2 through Figure 5 was larger than 0.05,
the implication was that not even an adequate univariate model could be fit using any of
the available covariates for 19th Century extra-state wars. That is, a statistically
significant relationship could not be established between the winner of a 19th Century
extra-systemic war and the covariates selected from the COWP data. While
discouraging, these results gave additional support to the results from the rule of 10.
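The "Z", "P", and odds ratio columns of the MINITAB tables can be reproduced from the coefficient and its standard error. The sketch below assumes that the Wald statistic of equation (2.32) is the usual ratio of the estimate to its standard error, that the p-value is two-sided under a standard normal distribution, and that the 95% interval uses the critical value 1.96; the function name is illustrative.

```python
import math

def wald_summary(coef, se, z_crit=1.96):
    """Return (z, p, odds_ratio, (lower, upper)) for one logistic coefficient."""
    z = coef / se                                   # Wald statistic
    p = math.erfc(abs(z) / math.sqrt(2.0))          # two-sided normal p-value
    odds_ratio = math.exp(coef)
    ci = (math.exp(coef - z_crit * se), math.exp(coef + z_crit * se))
    return z, p, odds_ratio, ci

# Duration coefficient and standard error from the 19th Century extra-state model.
z, p, odds, (lo, hi) = wald_summary(0.144370, 0.583536)
print(f"Z = {z:.2f}, P = {p:.3f}, odds ratio = {odds:.2f}, CI = ({lo:.2f}, {hi:.2f})")
```

A confidence interval for the odds ratio that contains 1 carries the same message as a Wald p-value above 0.05.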


Extra-State Wars (20th Century).

The results from the stepwise procedure in MINITAB for 20th Century extra-state
wars are shown in Figure 6. The duration of the war and the proportion of the state's
armed forces killed were selected by the stepwise process for entry into the model. The
p-value for duration in Step 2 was very low, which implied that duration should prove
significant in any main-effects models of the available covariates. The p-value for
C_Dths/Arm_ES_UNS_20, however, was just barely smaller than the p-value for entry
into the model. A model containing both of these covariates was estimated. It was
expected that at least one of these covariates would be significant. The Wald statistics for
this initial model were inspected to determine if one of the covariates should be
eliminated. This model is discussed in a later section of this thesis.




Stepwise Regression: Winner_ES_UN versus Dur_ES_UNS_2, C_Dths/Pop_E, ...

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.2

Response is Winner_ES_UNS_20 on 4 predictors, with N = 24

Step                        1      2
Constant                1.365  1.355

Dur_ES_UNS_20           0.231  0.229
T-Value                  3.06   3.14
P-Value                 0.006  0.005

C_Dths/Arm_ES_UNS_20           0.105
T-Value                         1.66
P-Value                        0.112

Figure 6: MINITAB Results for 20th Century Extra-State Wars


It was also interesting to note that while the rule of 10 proved valid for the 19th
Century extra-state wars, it did not for the 20th Century data. From MINITAB, $n_1 = 13$,
$n_2 = 11$, and $m = \min(13, 11) = 11$. Therefore, the resulting model should contain no
more than $k + 1 \le (m/10) = 1.1 \approx 1$ parameter. This implied that the model should contain
$k = 0$ covariates; that is, a constant-only model. It should be reiterated, however, that the
rule of 10 is not absolute.


Extra-State  Wars  (Aggregated  Data). 

In addition to dividing up the observations between those of the 19th Century and
those of the 20th Century, stepwise regression was also performed on extra-state wars
using all $n = 59$ observations. The results are shown in Figure 7.
Stepwise Regression: Winner_ES_UN versus Duration_ES_, C_Deaths/Pop, ...

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.2

Response is Winner_ES_UNS on 4 predictors, with N = 59

Step                  1
Constant           1.32

Duration_ES_UNS   0.161
T-Value            2.75
P-Value           0.008

Figure 7: Stepwise Results for Full Extra-State Wars Set


Here, the duration of the war was the only covariate significant enough to be included in
the model at the settings used for this study. Furthermore, its p-value for the likelihood
ratio test was again very small. It was expected that Duration_ES_UNS would
demonstrate high significance in the estimated univariate model for predicting the winner
of an extra-systemic war, which is discussed in a later section.

The rule of 10 provided a nearly accurate assessment in this case. Out of 59
observations, $m = \min(n_1, n_2) = \min(40, 19) = 19$ was the minimum frequency, so the
model should contain no more than $k + 1 \le (19/10) = 1.9 \approx 1$ parameter. However, this
result is so close to 2 that a univariate model, with duration as the independent variable,
was believed to be adequate.
Intrastate Wars (19th Century).

The results from the stepwise regression procedure on 19th Century intrastate wars
are given in Figure 8. As with the 19th Century extra-state wars, no covariates entered
the model, which implied that none of the four univariate models estimated possessed
likelihood ratio p-values smaller than 0.15. Four univariate models, one for each of the
covariates, were fit to show these large p-values. Figure 9 through Figure 12 give the
MINITAB outputs for each of these four models, and the likelihood ratio p-values can be
seen at the bottom of each figure. These figures showed that none of the covariates from
the COWP data could be used to form a model predicting the winner of a 19th Century
intrastate war.


Stepwise Regression: Winner_IS_UN versus Duration_IS_, Dead/Pop_IS

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.2

Response is Winner_IS_UNS_19 on 4 predictors, with N = 30

No variables entered or removed

Figure 8: Stepwise Results for 19th Century Intrastate Wars


Binary Logistic Regression: Winner_IS_UNS_19 versus Duration_IS_UNS_19

Variable          Value  Count
Winner_IS_UNS_19  1         23  (Event)
                  2          7
                  Total     30

Logistic Regression Table

                                                        Odds     95% CI
Predictor               Coef   SE Coef     Z      P    Ratio  Lower  Upper
Constant                1.32  0.602988  2.19  0.029
Duration_IS_UNS_19  0.249261  0.765408  0.33  0.745     1.28   0.29   5.75

Log-Likelihood = -16.241
Test that all slopes are zero: G = 0.114, DF = 1, P-Value = 0.735

Figure 9: Univariate Logit Model (Duration is Covariate)
The univariate model containing the duration of the conflict as the sole predictor
of the winner possessed a 0.735 likelihood ratio p-value. Because 0.735 > 0.05, the null
hypothesis that all model coefficients are zero was not rejected. Thus, duration was not
sufficient to explain the winner of a 19th Century intrastate war.


Binary Logistic Regression: Winner_IS_UNS_19 versus Dead/Pop_IS_UNS_19

Variable          Value  Count
Winner_IS_UNS_19  1         23  (Event)
                  2          7
                  Total     30

Logistic Regression Table

                                                       Odds      95% CI
Predictor              Coef  SE Coef     Z      P     Ratio  Lower     Upper
Constant            1.98819  1.61659  1.23  0.219
Dead/Pop_IS_UNS_19  3.63592  6.48808  0.56  0.575     37.94      0  12642267

Log-Likelihood = -15.819
Test that all slopes are zero: G = 0.958, DF = 1, P-Value = 0.328

Figure 10: Univariate Logit Model (Deaths per Population)


The univariate model containing the number of state combat deaths as a
proportion of its population as the sole predictor of the winner possessed a 0.328
likelihood ratio p-value. Because 0.328 > 0.05, the null hypothesis that all model
coefficients are zero was not rejected. Thus, the proportion of state combat deaths to its
population was not sufficient to explain the winner of a 19th Century intrastate war.
Binary Logistic Regression: Winner_IS_UNS_19 versus Dead/Arm_IS_UNS_19

Variable          Value  Count
Winner_IS_UNS_19  1         23  (Event)
                  2          7
                  Total     30

Logistic Regression Table

                                                       Odds      95% CI
Predictor              Coef  SE Coef     Z      P     Ratio  Lower    Upper
Constant            2.16014  2.04746  1.06  0.291
Dead/Arm_IS_UNS_19  3.09775  6.13261  0.51  0.613     22.15      0  3677249

Log-Likelihood = -16.029
Test that all slopes are zero: G = 0.539, DF = 1, P-Value = 0.463

Figure 11: Univariate Logit Model (Deaths per Total Armed Forces)


The univariate model containing the number of state combat deaths as a
proportion of its armed force size as the sole predictor of the winner possessed a 0.463
likelihood ratio p-value. Because 0.463 > 0.05, the null hypothesis that all model
coefficients are zero was not rejected. Thus, the proportion of state combat deaths to its
armed force size was not sufficient to explain the winner of a 19th Century intrastate war.


Binary Logistic Regression: Winner_IS_UNS_19 versus C_Dead/TotDead_IS_UNS_19

Variable          Value  Count
Winner_IS_UNS_19  1         23  (Event)
                  2          7
                  Total     30

Logistic Regression Table

                                                            Odds     95% CI
Predictor                     Coef   SE Coef     Z      P  Ratio  Lower  Upper
Constant                   1.39879  0.508392  2.75  0.006
C_Dead/TotDead_IS_UNS_19  -0.40011  0.398705    -1  0.316   0.67   0.31   1.46

Log-Likelihood = -15.786
Test that all slopes are zero: G = 1.025, DF = 1, P-Value = 0.31

Figure 12: Univariate Logit Model (Proportion of Total Casualties)
The univariate model containing the proportion of total combat deaths sustained
by the participant as the sole predictor of the winner possessed a 0.31 likelihood ratio
p-value. Because 0.31 > 0.05, the null hypothesis that all model coefficients are zero was
not rejected. Thus, the proportion of total combat deaths sustained by the participant was
not sufficient to explain the winner of a 19th Century intrastate war.

As expected, each of the likelihood ratio p-values for each of the above models
was larger than 0.15. The above figures also demonstrated that not even a statistically
significant univariate model could be estimated for the 19th Century intrastate wars, as
seen from the Wald statistic p-values each being much larger than the selected $\alpha$
significance level, 0.05.

In this case, the rule of 10 withstood scrutiny. That is, the results from the rule of
10 followed those obtained by stepwise regression. A model for the 19th Century
intrastate wars should contain no more than $(m/10) = 0.7 \approx 0$ parameters,
where $m = \min(n_1, n_2) = \min(23, 7) = 7$. Since each of the Wald statistic p-values was
larger than 0.05, each of which failed to reject the null hypothesis of equation (2.29), no
final model for the 19th Century intrastate war data was estimated. That is, a statistically
significant relationship could not be established between the winner of a 19th Century
intrastate war and the covariates derived from the COWP data.
Intrastate Wars (20th Century).

A total of $n = 43$ observations were available for analysis of 20th Century
intrastate wars. The stepwise regression procedure suggested using three covariates in
the model. The MINITAB output is shown in Figure 13. A binary logistic regression
model, containing the three covariates suggested by the stepwise procedure, was
estimated. The resulting parameter estimates, goodness-of-fit, and diagnostic measures
are examined in a later section. The values for these measures influenced the substance
of the final model.


Stepwise Regression: Winner_IS_UN versus Duration_IS_, Dead/Pop_IS_, ...

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.2

Response is Winner_IS_UNS_20 on 4 predictors, with N = 43

Step                     1       2       3
Constant             1.317   1.322   1.334

Duration_IS_UNS_20   0.225   0.262   0.246
T-Value               3.56    3.94    3.77
P-Value              0.001       0   0.001

Dead/Arm_IS_UNS_20          -0.088  -0.214
T-Value                      -1.56    -2.5
P-Value                      0.127   0.017

Dead/Pop_IS_UNS_20                   0.162
T-Value                               1.92
P-Value                              0.062

Figure 13: Stepwise Regression (20th Century Intrastate Wars)
It was also interesting to note that the stepwise results contradicted the rule of 10.
Given that $n_1 = 26$ and $n_2 = 17$, the rule of 10 indicated that the model should only
contain up to $(m/10) = (17/10) = 1.7 \approx 1$ parameter. If this result was accurate, then the
stepwise regression should not have allowed any covariates to enter the model, because
the single parameter would be the constant.


Intrastate  Wars  (Aggregated). 

Stepwise  regression  was  again  conducted  on  aggregated  data,  this  time  for  the 
n  =  73  observations  in  the  entire  intrastate  wars  data  set.  It  was  interesting  to  confirm 
that  the  same  single  covariate,  duration,  was  entered  into  the  model  for  both  this  case  and 
for  the  aggregated  extra-state  wars  case.  The  results  are  shown  in  Figure  14.  A 
possibility  considered  here  was  that  a  general  relationship  between  the  duration  of  both 
intrastate  and  extra-state  wars  and  the  winners  of  both  may  exist.  The  extent  of  such  a 
relationship  was  examined  after  fitting  the  final  models  for  both  types  of  wars. 


Stepwise Regression: Winner_IntS versus C_Dead/TotDe, Dead/Arm_...

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.2

Response is Winner_IntS on 4 predictors, with N = 73

Step                    1
Constant            1.329

Duration_IntS_UNS   0.176
T-Value              3.38
P-Value             0.001

Figure 14: Stepwise Results (Aggregated Intrastate Wars)
In this case, the stepwise results correspond to those of the rule of 10. Given that
$n_1 = 49$ and $n_2 = 24$, the rule of 10 suggests that the model can contain up to 2
parameters. Thus, it was expected that the univariate model using the duration of the
conflict to predict the winner of an intrastate war would exhibit an adequate fit, and the
length of the war would show statistical significance through its Wald p-value.


Interstate Wars (19th Century).

Using the values of $p_E = 0.15$ and $p_R = 0.2$, no covariates were entered into the
model from the stepwise procedure, shown in Figure 15. As a test, the p-values for entry
and removal were then incremented by 0.01 to determine just how large the entry p-value
needed to be in order to admit even one covariate. It was found that the p-value for entry
needed to be at least 0.31, and only the covariate Dths/TDths_UNS_19 was admitted
at $p_E = 0.31$, which is given in Figure 16. This result gave both additional support to the
validity of the stepwise procedure and justification to the default level of $p_E$. In general,
the $p_E$ level necessary to include even one covariate in a multinomial model for
predicting the outcome of an interstate war was too large to suggest that the resulting
model correctly described the relationship between the outcome of an interstate war and
the single predictor variable.
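The search for the smallest workable entry level amounts to raising the entry p-value in steps of 0.01 until the best candidate covariate qualifies. The sketch below is a simplified illustration, not the MINITAB procedure: it assumes the smallest candidate p-value is already known, and it counts in whole hundredths to avoid floating-point drift. The function name and arguments are hypothetical.

```python
def smallest_entry_level(min_pvalue, start_hundredths=15):
    """Raise the entry level by 0.01 until min_pvalue falls below it.

    Returns the first level that admits the covariate, or None if no
    level up to 1.00 does.
    """
    for h in range(start_hundredths, 101):
        level = h / 100
        if min_pvalue < level:      # same entry rule as the stepwise procedure
            return level
    return None

# Smallest p-value reported when the first covariate was admitted.
print(smallest_entry_level(0.304))
```

An entry level this far above the customary interval of [0.15, 0.20] is the reason the resulting model was not pursued.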
Stepwise Regression: Outcome(PR2) versus MilEx/TT_UNS, Duration_UNS, ...

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.2

Response is Outcome(PR2)_19 on 5 predictors, with N = 58

No variables entered or removed.

Figure 15: Stepwise Results (Default Entry P-Value)


Stepwise Regression: Outcome(PR2) versus MilEx/TT_UNS, Duration_UNS, ...

Alpha-to-Enter: 0.31  Alpha-to-Remove: 0.35

Response is Outcome(PR2)_19 on 5 predictors, with N = 58

Step                      1
Constant              2.169

Dths/TDeaths_UNS_19     0.2
T-Value                1.04
P-Value               0.304

Figure 16: Stepwise Results (Incremented Entry P-Value)


It was found that the rule of 10 was again applicable in this case. The five
outcome category frequencies were $n_1 = 25$, $n_2 = 16$, $n_3 = 2$, $n_4 = 10$, and $n_5 = 5$. Given
that the smallest frequency was 2, the rule of 10 indicated that the model contain 0
parameters. That is, a correctly specified univariate multinomial model for predicting the
outcome of a 19th Century interstate war could not be obtained using any of the covariates
derived from the COWP data. The observations from the two outcomes with the lowest
frequencies, 2 and 5, could have been eliminated and the stepwise procedure repeated.
However, the rule of 10 would have then suggested at most a constant-only model. The
only way to have had the rule of 10 results reflect a model with at least 2 parameters, that
is, a covariate and intercept, was to eliminate all 19th Century interstate war observations
except those corresponding to the first outcome. The problem would have then ceased to
be a logistic regression problem, since a logistic regression problem requires a response
variable with at least two categories. These limitations were only present in the COWP
data. The COWP data on interstate wars contained too many missing entries, and the
complete data were skewed in favor of victory by military imposition. Therefore, no
further elimination of observations was performed, and no model for the 19th Century
interstate wars was estimated.

Just  as  with  binomial  outcomes,  the  likelihood  ratio  test  is  also  the  basis  of 
comparison  in  stepwise  regression  for  multinomial  outcomes.  Univariate  multinomial 
models  were  estimated,  one  for  each  of  the  five  available  covariates,  and  the  MINITAB 
outputs for each model are shown in Figure 17 through Figure 21. The likelihood ratio
statistic and its p-value are shown at the bottom of each figure. The purpose of estimating
these  models  was  to  confirm  the  results  from  the  stepwise  procedure. 
Nominal Logistic Regression: Outcome(PR2)_19 versus Dths/Pop_UNS_19

Response Information

Variable         Value  Count
Outcome(PR2)_19  1         25  (Reference Event)
                 5          5
                 4         10
                 3          2
                 2         16
                 Total     58

Logistic Regression Table

                                                     Odds      95% CI
Predictor             Coef   SE Coef      Z      P  Ratio  Lower     Upper
Logit 1: (5/1)
Constant          -1.54404  0.841252  -1.84  0.066
Dths/Pop_UNS_19   0.292128   3.08618   0.09  0.925   1.34      0    567.39
Logit 2: (4/1)
Constant          -1.32853  0.904292  -1.47  0.142
Dths/Pop_UNS_19   -1.69837   3.28726  -0.52  0.605   0.18      0    114.97
Logit 3: (3/1)
Constant          -8.30628   10.1307  -0.82  0.412
Dths/Pop_UNS_19   -20.4914   34.2286   -0.6  0.549      0      0  1.72E+20
Logit 4: (2/1)
Constant         -0.702933  0.679463  -1.03  0.301
Dths/Pop_UNS_19   -1.07874   2.48216  -0.43  0.664   0.34      0     44.09

Log-Likelihood = -77.503
Test that all slopes are zero: G = 1.419, DF = 4, P-Value = 0.841

Figure 17: Univariate Multinomial Model (Deaths/Population)


The univariate multinomial model containing the proportion of participant combat
deaths to its population exhibited a 0.841 likelihood ratio p-value. Because 0.841 > 0.05,
the null hypothesis that the coefficients in each logit are zero was not rejected. Thus, the
proportion of participant combat deaths to its population was not sufficient to predict the
outcome of a 19th Century interstate war.
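With five outcome categories and a single covariate, the multinomial model has four logits and therefore four slope coefficients, so the likelihood ratio statistic is referred to a chi-square distribution with four degrees of freedom. For an even number of degrees of freedom the upper-tail probability has a closed form, which the sketch below uses to recompute the p-value from the reported G. The function name is illustrative, and the closed form is a standard result rather than anything specific to MINITAB.

```python
import math

def chi2_upper_tail_even_df(g, df):
    """Upper-tail chi-square probability for even df, using the closed form
    exp(-g/2) * sum_{i < df/2} (g/2)**i / i!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = g / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (g/2)**i / i! incrementally
        total += term
    return math.exp(-half) * total

# G for the deaths-per-population model, with DF = 4.
print(f"p-value = {chi2_upper_tail_even_df(1.419, 4):.3f}")
```

The same function with `df=2` reduces to `exp(-g/2)`, which gives a quick sanity check on the formula.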




Nominal Logistic Regression: Outcome(PR2)_19 versus Dths/Arm_UNS_19

Response Information

Variable         Value  Count
Outcome(PR2)_19  1         25  (Reference Event)
                 5          5
                 4         10
                 3          2
                 2         16
                 Total     58

Logistic Regression Table

                                                   Odds      95% CI
Predictor            Coef  SE Coef      Z      P  Ratio  Lower     Upper
Logit 1: (5/1)
Constant         -1.94081  1.12036  -1.73  0.083
Dths/Arm_UNS_19  -2.11182  5.94128  -0.36  0.722   0.12      0   13809.2
Logit 2: (4/1)
Constant         -1.51576  1.38604  -1.09  0.274
Dths/Arm_UNS_19   -3.6547  7.63922  -0.48  0.632   0.03      0  82304.04
Logit 3: (3/1)
Constant         -3.99637  4.71273  -0.85  0.396
Dths/Arm_UNS_19  -8.52392   25.738  -0.33  0.741      0      0  1.61E+18
Logit 4: (2/1)
Constant         -1.23185  1.43243  -0.86   0.39
Dths/Arm_UNS_19  -4.70731    7.939  -0.59  0.553   0.01      0  51695.18

Log-Likelihood = -77.314
Test that all slopes are zero: G = 1.799, DF = 4, P-Value = 0.773

Figure 18: Univariate Multinomial Model (Deaths/Armed Forces)


The univariate multinomial model containing the proportion of participant combat
deaths to its armed force size exhibited a 0.773 likelihood ratio p-value. Because
0.773 > 0.05, the null hypothesis that the coefficients in each logit are zero was not
rejected. Thus, the proportion of participant combat deaths to its armed force size was
not sufficient to predict the outcome of a 19th Century interstate war.
Nominal Logistic Regression: Outcome(PR2)_19 versus Dths/Tdeaths_UNS_19

Response Information

Variable         Value  Count
Outcome(PR2)_19  1         25  (Reference Event)
                 5          5
                 4         10
                 3          2
                 2         16
                 Total     58

Logistic Regression Table

                                                         Odds     95% CI
Predictor                 Coef   SE Coef      Z      P  Ratio  Lower  Upper
Logit 1: (5/1)
Constant              -1.90113  0.596498  -3.19  0.001
Dths/TDeaths_UNS_19   0.747108   0.51968   1.44  0.151   2.11   0.76   5.85
Logit 2: (4/1)
Constant              -0.91742  0.375327  -2.44  0.015
Dths/TDeaths_UNS_19  0.0167269  0.417589   0.04  0.968   1.02   0.45   2.31
Logit 3: (3/1)
Constant              -2.80629  0.902243  -3.11  0.002
Dths/TDeaths_UNS_19   0.731972  0.759506   0.96  0.335   2.08   0.47   9.21
Logit 4: (2/1)
Constant             -0.471784  0.325654  -1.45  0.147
Dths/TDeaths_UNS_19   0.185769  0.350366   0.53  0.596    1.2   0.61   2.39

Log-Likelihood = -76.755
Test that all slopes are zero: G = 2.917, DF = 4, P-Value = 0.572

Figure 19: Univariate Multinomial Model (Proportion of Total Dead)


The univariate multinomial model containing the proportion of total combat
deaths sustained by the participant exhibited a 0.572 likelihood ratio p-value. Because
0.572 > 0.05, the null hypothesis that the coefficients in each logit are zero was not
rejected. Thus, the proportion of total combat deaths sustained by the participant was not
sufficient to predict the outcome of a 19th Century interstate war.
Nominal Logistic Regression: Outcome(PR2)_19 versus Duration_UNS_19

Response Information

Variable         Value  Count
Outcome(PR2)_19  1         25  (Reference Event)
                 5          5
                 4         10
                 3          2
                 2         16
                 Total     58

Logistic Regression Table

                                                         Odds     95% CI
Predictor             Coef   SE Coef      Z      P      Ratio  Lower  Upper
Logit 1: (5/1)
Constant          -1.75906  0.595328  -2.95  0.003
Duration_UNS_19  -0.488181  0.888294  -0.55  0.583       0.61   0.11    3.5
Logit 2: (4/1)
Constant         -0.911427  0.385027  -2.37  0.018
Duration_UNS_19  0.0259329  0.492012   0.05  0.958       1.03   0.39   2.69
Logit 3: (3/1)
Constant          -3.02605   1.40983  -2.15  0.032
Duration_UNS_19   -1.23568   2.32598  -0.53  0.595       0.29      0  27.75
Logit 4: (2/1)
Constant         -0.520875   0.35039  -1.49  0.137
Duration_UNS_19   -0.28124  0.485538  -0.58  0.562       0.75   0.29   1.96

Log-Likelihood = -77.644
Test that all slopes are zero: G = 1.139, DF = 4, P-Value = 0.888

Figure 20: Univariate Multinomial Model (War Duration)


The univariate multinomial model containing the duration of a 19th Century
interstate war exhibited a 0.888 likelihood ratio p-value. Because 0.888 > 0.05, the null
hypothesis that the coefficients in each logit are zero was not rejected. Thus, the duration
of a 19th Century interstate war was not sufficient to predict the outcome of a 19th
Century interstate war.
Nominal Logistic Regression: Outcome(PR2)_19 versus MilEx/TT_UNS_19

Response Information

Variable         Value  Count
Outcome(PR2)_19  1         25  (Reference Event)
                 5          5
                 4         10
                 3          2
                 2         16
                 Total     58

Logistic Regression Table

                                                   Odds       95% CI
Predictor             Coef  SE Coef      P        Ratio  Lower       Upper
Logit 1: (5/1)
Constant          -25.8416   33.853  0.445
MilEx/TT_UNS_19   -164.655  229.265  0.473            0      0   4.42E+123
Logit 2: (4/1)
Constant          -20.2078  22.6403  0.372
MilEx/TT_UNS_19   -131.193  153.546  0.393            0      0    5.30E+73
Logit 3: (3/1)
Constant           5.79748  11.8392  0.624
MilEx/TT_UNS_19    57.4596   82.236  0.485     9.00E+24      0    9.02E+94
Logit 4: (2/1)
Constant          0.609505  8.56002  0.943
MilEx/TT_UNS_19    7.23107  58.5934  0.902       1381.7      0    1.04E+53

Log-Likelihood = -76.989
Test that all slopes are zero: G = 2.447, DF = 4, P-Value = 0.654

Figure 21: Univariate Multinomial Model (Military Expenditures/Total Trade)


The univariate multinomial model containing the average amount of military
spending as a proportion of the average total trade for a 19th Century interstate war
exhibited a 0.654 likelihood ratio p-value. Because 0.654 > 0.05, the null hypothesis that
the coefficients in each logit are zero was not rejected. Thus, the average amount of
military spending as a proportion of the average total trade for a 19th Century interstate
war was not sufficient to predict the outcome of a 19th Century interstate war.
For Figure 21, the Z column was removed for two reasons. One, the values under
the Z column were irrelevant to what was being demonstrated. That is, the likelihood
ratio p-value for the model was the quantity of interest in each figure. Two, if the values
under the Z column were needed, then they could be computed directly using equation
(2.32), because the values labeled Z in MINITAB are equivalent to the Wald statistic, W.
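
As an illustration of the second point, the following sketch recovers a Z value and its two-sided p-value from a coefficient and its standard error. It assumes, as stated above, that the Z reported by MINITAB is the coefficient divided by its standard error, and that the p-value is taken from the standard normal distribution. The function name is illustrative and does not come from the original analysis.

```python
import math

def wald_z_and_p(coef, se_coef):
    """Return (Z, two-sided p-value) for a coefficient estimate."""
    z = coef / se_coef
    # Two-sided standard normal tail: P(|N(0,1)| > |z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Covariate row of Logit 4 in Figure 21: Coef = 7.23107, SE Coef = 58.5934
z, p = wald_z_and_p(7.23107, 58.5934)
print(round(z, 2), round(p, 3))
```

Applied to any other row of Figure 21, the same computation should return a p-value that agrees with the P column to rounding.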

It should be noted that the aforementioned stepwise selection results only applied
to the available COWP data. Investigations of the outcomes of 19th Century interstate
wars using other data sources may yield different stepwise selection results. It is
imperative that the primary and secondary goals of this study be reiterated. The findings
in this thesis appear only as a consequence of strictly using the COWP data. The purpose
of subjecting the COWP data on interstate wars to stepwise selection was both to
demonstrate the applicability of logistic regression to war termination studies and to
expose the limitations of using open-source data.


Interstate Wars (20th Century).

The results from the rule of 10 for the n = 167 observations on 20th Century
interstate wars did not match those from the stepwise selection, which admitted two
covariates to the model. In this case, the outcome frequencies included 62, 29, 16, and 23.
With the fourth outcome having the smallest frequency, 16, the rule of 10 showed that the
model should contain no more than 1 parameter. The results from stepwise selection in
MINITAB, which are given in Figure 22, indicated otherwise.
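
The rule of 10 arithmetic used in this chapter can be sketched as follows. This is a minimal illustration that assumes the rule is applied as the smallest outcome frequency divided by 10 and rounded down, which is consistent with the figures quoted for the 20th Century and aggregated data sets. The function name is hypothetical.

```python
def max_parameters_rule_of_10(smallest_outcome_frequency):
    """Largest number of model parameters supported under the rule of 10."""
    # Assumption: one parameter is allowed per 10 observations in the
    # least frequent outcome category, rounded down.
    return smallest_outcome_frequency // 10

print(max_parameters_rule_of_10(16))  # 20th Century interstate wars
print(max_parameters_rule_of_10(26))  # aggregated interstate wars
```

Under this reading, a smallest frequency of 16 supports one parameter, while a smallest frequency of 26 supports two.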
Stepwise Regression: Outcome(PR2) versus MilEx/TT_UNS, Duration_UNS,...

Alpha-to-Enter: 0.15   Alpha-to-Remove: 0.2

Response is Outcome(PR2)_20 on 5 predictors, with N = 167

Step                      1       2
Constant               2.44   2.452

Dths/TDeaths_UNS_20    0.51    0.48
T-Value                4.99    4.67
P-Value                   0       0

Duration_UNS_20              -0.147
T-Value                       -1.51
P-Value                       0.133

Figure 22: Stepwise Selection (20th Century Interstate Wars)


Two covariates were selected for inclusion into the model: the duration of the war
and the proportion of total deaths borne by the participant. Presented in a later section,
this bivariate model was estimated, and the Wald statistics were examined to assess the
individual significance of each covariate in the model. Once the final model was
established, then the goodness-of-fit tests and diagnostic measures were analyzed.

The results in Figure 22 provided a starting point for constructing a multinomial
prediction model for the outcome of 20th Century interstate wars. The two covariates,
duration and the proportion of total deaths borne by the participant, were used to estimate
an initial model. It was expected that the goodness-of-fit statistics for the initial model
would show it to be adequate. It was also expected that the likelihood ratio p-value for
the initial model would support the notion that at least one of the included covariates was
significant to predicting the outcome of a 20th Century interstate war. The p-values for
the Wald statistics suggested which covariate, if not both, was to be retained in the final
model.
Interstate Wars (Aggregated).


Considering all n = 225 observations in the interstate wars data set, stepwise
regression selected only one covariate: the proportion of total deaths borne by the
participant, or Deaths/TotDeaths_UNS. Figure 23 shows the results from MINITAB.
Additionally, computations from the rule of 10 supported the stepwise results. That is,
the smallest outcome frequency was 26, so the rule of 10 concluded that the model could
contain up to 2 parameters, making it at most a univariate multinomial model.


Stepwise Regression: Outcome(PR2) versus Deaths/Pop_U, Deaths/Arm_U,...

Alpha-to-Enter: 0.15   Alpha-to-Remove: 0.2

Response is Outcome(PR2) on 5 predictors, with N = 225

Step                       1
Constant               2.356

Deaths/TotDeaths_UNS   0.421
T-Value                 4.67
P-Value                    0


Figure 23: Stepwise Results (225 Interstate Wars)


Given that the covariate concerning casualty proportions was admitted in the
stepwise results for both the aggregated interstate wars set and the 20th Century interstate
wars set, the possibility that this covariate would be highly significant in both
multinomial models was considered. The extent of this significance is discussed in later
sections, where tests on individual model coefficients are conducted.

When the model for predicting the outcome of an interstate war was estimated, it
was expected that the goodness-of-fit tests would show the model to be correctly
specified. Additionally, the likelihood ratio test and Wald test were expected to indicate
the statistical significance of the casualty proportion covariate in predicting the outcomes
of interstate wars.


Binary  Logistic  Regression  Models  on  Winner 

A total of four binary logistic regression models were estimated and analyzed.
Instead of six models, as was postulated in the previous chapter, only four models were
fit because in two out of the six possible cases, the stepwise regression procedure did not
allow any covariates to enter the model. This result implied that in any univariate model
for those cases, the resulting p-value for its likelihood ratio test statistic, G, was larger
than the defined significance level for this research, $\alpha = 0.05$, the fact that it was also
larger than $p_E = 0.15$ notwithstanding. Two binary models were estimated for the
extra-state wars data: one for the 20th Century observations and the other for the aggregated
data. The same was also done for the intrastate wars data. The results for the extra-state
models are presented first.


20th Century Extra-State Wars Model.

The initial model estimated for the 20th Century extra-state wars data followed the
recommendations from the stepwise selection procedure and contained two covariates:
Dur_ES_UNS_20 and C_Dths/Arm_ES_UNS_20. That is, stepwise regression considered
the length of an extra-state war and the number of state deaths as a proportion of the
state's military manpower as significant in predicting the winner of an extra-systemic
war. The initial model fit from MINITAB is shown in Figure 24.

The initial model estimated for the 20th Century extra-state wars data contained
those covariates identified by the stepwise procedure. Thus, the initial model was
bivariate, containing the covariates concerning the duration of the conflict,
Dur_ES_UNS_20, and the number of state deaths as a proportion of the state's military
manpower, C_Dths/Arm_ES_UNS_20. Figure 24 shows the parameter estimates, Wald
statistics, odds ratios, likelihood ratio test, deviance, Pearson chi-square test, and the
Hosmer-Lemeshow test for the initial model.


Binary Logistic Regression: Winner_ES_UN versus Dur_ES_U, C_Dths/Arm_E

Response Information

Variable           Value  Count
Winner_ES_UNS_20   1         13  (Event)
                   2         11
                   Total     24

Logistic Regression Table

                                                            Odds     95% CI
Predictor                 Coef   SE Coef      Z      P     Ratio  Lower  Upper
Constant              0.478616  0.717177   0.67  0.505
Dur_ES_UNS_20         -1.08089  0.511024  -2.12  0.034      0.34   0.12   0.92
C_Dths/Arm_ES_UNS_20   -1.2966   2.11985  -0.61  0.541      0.27      0  17.43

Log-Likelihood = -11.31
Test that all slopes are zero: G = 10.485, DF = 2, P-Value = 0.005

Goodness-of-Fit Tests

Method           Chi-Square  DF      P
Pearson             22.5125  21  0.371
Deviance            22.6194  21  0.365
Hosmer-Lemeshow      6.0479   8  0.642

Figure 24: MINITAB Output (Initial Model)
The p-value 0.371 for the Pearson chi-square statistic was larger than $\alpha$, which
implied that the model using the duration of a 20th Century extra-systemic war and the
proportion of state combat deaths to its armed force size to predict the winner of a 20th
Century extra-state war was adequately fit. Similar implications were made from the
p-values for the Deviance and Hosmer-Lemeshow statistics, which were 0.365 > 0.05 and
0.642 > 0.05, respectively.

Because the p-value for the likelihood ratio statistic was 0.005 < $\alpha = 0.05$, the
null hypothesis from equation (2.29) was rejected in favor of the alternative hypothesis,
$H_1$. That is, there was sufficient evidence to suggest that at least one of the model
coefficients was nonzero. The Wald statistics for each of the two covariates in the initial
model were examined to determine which covariate, if not both, needed to be retained in
the final model.

From the discussion of Wald statistics in Chapter 2 and equation (2.32), it follows
that a Wald statistic with a p-value smaller than $\alpha = 0.05$ implies significance of the
covariate under test. While the Wald p-value for Dur_ES_UNS_20 was 0.034, the
p-value for C_Dths/Arm_ES_UNS_20 was 0.541, which suggested that the number of state
deaths as a proportion of the state's military manpower was not significant to the model
at an $\alpha = 0.05$ level. This result implied that a reduced model needed to be estimated. In
spite of the initial model proving adequate, a reduced model was estimated that included
only the Dur_ES_UNS_20 covariate. A likelihood ratio test was then performed to
compare the two models. The results of this comparison determined whether or not the
reduced model was adequate enough to continue with odds ratio interpretation and
diagnostics.

The logistic regression table along with the likelihood ratio test and goodness-of-fit
tests for the reduced extra-state wars model is given in Figure 25.


Binary Logistic Regression: Winner_ES_UNS_20 versus Dur_ES_UNS_20

Response Information

Variable           Value  Count
Winner_ES_UNS_20   1         13  (Event)
                   2         11
                   Total     24

Logistic Regression Table

                                                     Odds     95% CI
Predictor          Coef   SE Coef      Z      P     Ratio  Lower  Upper
Constant       0.568524  0.500277   1.14  0.256
Dur_ES_UNS_20  -1.18758  0.516323   -2.3  0.021       0.3   0.11   0.84

Log-Likelihood = -12.575
Test that all slopes are zero: G = 7.953, DF = 1, P-Value = 0.005

Goodness-of-Fit Tests

Method           Chi-Square  DF      P
Pearson             25.3187  20   0.19
Deviance            25.1509  20  0.196
Hosmer-Lemeshow     11.8916   8  0.156

Figure 25: Reduced Model Results


As with the initial model, the p-value of the Wald statistic for Dur_ES_UNS_20 was
0.021, which indicated that the duration of the conflict maintained its significance as a
covariate. A likelihood ratio test was performed using the computation described in
Section 0 to compare the reduced model to the initial model. This comparison was
computed as two times the difference between the log-likelihood of the initial model and
the log-likelihood of the reduced model. That is, $G = 2(-11.31 - (-12.575)) = 2.53$. The
critical chi-square value for this comparison was $\chi^2_{0.05,2} = 5.99$. Because the likelihood
ratio statistic was smaller than the critical chi-square value, the reduced model, like the
initial model, proved to be adequate. It was also noted that the likelihood ratio p-value
for the reduced model in Figure 25 was 0.005 < 0.05, which implied that at least one of
the reduced model parameters was nonzero.
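
The model comparison above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only. It takes the two log-likelihoods from the MINITAB output and converts G to upper-tail chi-square probabilities using closed forms for one and two degrees of freedom. The one-degree-of-freedom case is included as an assumption, on the grounds that the two models differ by a single covariate (critical value 3.84); the text uses the two-degree-of-freedom critical value of 5.99, and the conclusion is the same in either case.

```python
import math

def nested_lr_statistic(loglik_full, loglik_reduced):
    """G = 2 * (log-likelihood of full model - log-likelihood of reduced model)."""
    return 2.0 * (loglik_full - loglik_reduced)

# Log-likelihoods of the initial (bivariate) and reduced (univariate) models
g = nested_lr_statistic(-11.31, -12.575)

# Closed-form upper-tail chi-square probabilities:
# df = 1: P(X > g) = erfc(sqrt(g / 2));  df = 2: P(X > g) = exp(-g / 2)
p_df1 = math.erfc(math.sqrt(g / 2.0))
p_df2 = math.exp(-g / 2.0)

print(round(g, 2), round(p_df1, 3), round(p_df2, 3))
```

Both probabilities exceed 0.05, so dropping the casualty covariate does not significantly worsen the fit under either reference distribution.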

Since the three goodness-of-fit tests are statistically equivalent, the interpretation
of the MINITAB output for each test was straightforward. MINITAB displayed the
p-value for each statistic, which needed to be larger than $\alpha = 0.05$ to imply model
adequacy. The deviance, Pearson chi-square, and Hosmer-Lemeshow statistics are each
approximately distributed chi-square, and the computations differ only in their respective
degrees of freedom. Each of the goodness-of-fit p-values was larger than $\alpha$, which
implied that the reduced model was also adequately fit. Thus, the final logit for the 20th
Century extra-state data was expressed as

$$g(\mathrm{Dur\_ES\_UNS\_20}) = 0.57 - (1.19 \times \mathrm{Dur\_ES\_UNS\_20}), \tag{4.1}$$

and the logistic regression model for predicting the probability of Winner_ES_UNS_20
was given by

$$P(\mathrm{Winner} \mid \mathrm{Duration}) = \frac{e^{0.57 - (1.19 \times \mathrm{Duration})}}{1 + e^{0.57 - (1.19 \times \mathrm{Duration})}}. \tag{4.2}$$

The variable names Winner_ES_UNS_20 and Dur_ES_UNS_20 were truncated to
Winner and Duration for the purpose of explicitly stating the model in equation (4.2).
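
To make equation (4.2) concrete, the sketch below evaluates the fitted probability at a few values of the scaled duration. It is an illustration under two assumptions: that the event (Winner = 1) corresponds to a state victory, which is how the odds ratio is read later in this chapter, and that duration is entered in the same scaled units used to fit the model. The function name is hypothetical.

```python
import math

def prob_state_wins(duration_scaled, intercept=0.57, slope=-1.19):
    """P(Winner | Duration) from equation (4.2); duration is in scaled units."""
    g = intercept + slope * duration_scaled  # logit, equation (4.1)
    return math.exp(g) / (1.0 + math.exp(g))

# Fitted probability at 0, 1, and 2 scaled units of duration
for d in (0.0, 1.0, 2.0):
    print(d, round(prob_state_wins(d), 3))
```

The fitted probability falls as the scaled duration rises, which is the pattern that the odds ratio summarizes.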




Odds  Ratio  Interpretation. 


Since the reduced model was shown to be correctly specified, sufficient
justification existed to continue with odds ratio interpretation. The odds ratio for the
reduced model, at 0.3, implied an increased likelihood of the non-state actor emerging as
the winner. Using the description surrounding equation (2.16) in Chapter 2, the odds
ratio of 0.3 suggested that an extra-state war was 0.3 times as likely to end with the state
as the victor as with the non-state participant as the winner, given a single-unit increase
in the duration of the conflict. The 95% CI showed that this ratio could be as small as
0.11 or as large as 0.84. The tight range of the CI demonstrated a high level of
confidence in the accuracy of the estimated odds ratio. The odds ratio was smaller than
1, so it actually implied that the non-state actor was more likely to win in a long war
rather than the state. Defining a one-unit increase in war duration allowed a more
accurate assessment of the odds ratio. Since unit normal scaling was used, the length of a
single unit of war duration was denoted by the sample standard deviation of the
extra-state wars duration data, which was computed to be 1426.19. By inverting the odds ratio,
it followed that for about every 1426 days that an extra-state war lasts, the non-state
participant is approximately 3.33 times as likely to emerge as the winner as the state
participant. Therefore, in general, this result suggests that a long extra-systemic war
favors the insurgency. This was a particularly unsettling finding, given that the United
States has been engaged in the current war in Iraq for nearly 1460 days.
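
The odds ratio and its interval can be recovered from the coefficient and standard error reported for the reduced model. The sketch below is illustrative and assumes a Wald-type 95% interval, exp(Coef ± 1.96 × SE Coef). The function name is hypothetical.

```python
import math

def odds_ratio_with_ci(coef, se_coef, z_crit=1.96):
    """Odds ratio exp(coef) with a Wald-type confidence interval."""
    ratio = math.exp(coef)
    lower = math.exp(coef - z_crit * se_coef)
    upper = math.exp(coef + z_crit * se_coef)
    return ratio, lower, upper

# Duration row of the reduced model: Coef = -1.18758, SE Coef = 0.516323
ratio, lower, upper = odds_ratio_with_ci(-1.18758, 0.516323)
print(round(ratio, 2), round(lower, 2), round(upper, 2))

# Inverting the unrounded ratio gives the odds in favor of the non-state actor
print(round(1.0 / ratio, 2))
```

Inverting the unrounded odds ratio gives a value close to 3.28. The figure of 3.33 quoted above follows from inverting the rounded value of 0.3.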
Diagnostics and Plots.


Three diagnostic plots were examined: $\Delta D_j$ versus $\hat{\pi}_j$, $\Delta X_j^2$ versus $\hat{\pi}_j$, and
$\Delta\hat{\beta}_j$ versus $\hat{\pi}_j$. Large values of $\Delta D_j$ and $\Delta X_j^2$ indicated covariate patterns which
were poorly fit. These values could be identified by being located in the top left or top right
corners of the plots. Additionally, points far separated from the general pattern of the
remaining points could also be classified as poorly fit (Hosmer and Lemeshow,
2000:176-179). Figure 26 is the plot of the change in model deviance versus the
estimated probability of Winner_ES_UNS_20. The plot was examined for large values
of $\Delta D_j$. However, given that the goodness-of-fit tests showed the reduced model to be
correctly specified, very few poorly fit covariate patterns were expected to appear.


Deviance  Change  Plot  (Univariate  Extra-State  Wars) 


Figure  26:  Deviance  Change  Plot  for  20th  Century  Extra-State  Wars  Model 
Three data points stood out as having large values for $\Delta D_j$. This implied that the
five observations corresponding to these three distinct values for Dur_ES_UNS_20 were
inadequately fit. The five observations were from the Italo-Libyan War of 1920, the
Indonesian War of 1945, and the Western Saharan War of 1975. Their respective
durations were 4444, 340, and 1334 days. These distinct values for duration exerted
leverage on the fit of the model. Only 3 covariate patterns out of 20 were identified as
poorly fit. Therefore, it was unnecessary to remove the 5 observations corresponding to
these 3 covariate patterns and estimate a new model. Nonetheless, the extra-systemic
wars identified above are generally unfamiliar in historical studies, and it could be
beneficial to devote future statistical investigations to them.

The plot for the change in the Pearson chi-square statistic versus the estimated
probability, shown in Figure 27, was also examined for inadequately fit data points. This
plot indicated the same inadequately fit observations for duration as did the plot for the
change in deviance. The model was assessed to be correctly fit, so having only 5 poorly
fit observations out of n = 24 was considered acceptable. That is, sufficient evidence did
not exist to imply that the model needed to be estimated again with the five observations
removed.
Pearson Chi-Square Change Plot (Univariate Extra-State Wars)


Figure 27: Pearson Statistic Change Plot for 20th Century Extra-State Wars Model

Large values of $\Delta\hat{\beta}_j$ were expected to exhibit similar characteristics within the
plot in Figure 28 as both the large values of $\Delta D_j$ in Figure 26 and the large values of
$\Delta X_j^2$ in Figure 27. In contrast to the implications from large values of $\Delta D_j$ and $\Delta X_j^2$,
values of $\Delta\hat{\beta}_j$ that were both large and distanced from the general clustering of the
remaining plotted points were flagged as influence points. Specifically, these flagged
points corresponded to covariate patterns which had a significant effect on the values of
the model parameters. Any influence diagnostic larger than 1 provided sufficient
justification for deleting all observations corresponding to it and estimating a new model
(Hosmer and Lemeshow, 2000:180).
Beta Change Plot (Univariate Extra-State Wars)


Figure 28: Coefficient Change Plot for 20th Century Extra-State Wars

There were two such covariate patterns in Figure 28 which were considered
highly influential in parameter estimation. Three observations were identified by these
covariate patterns: Italian participation in the Italo-Libyan War of 1920, British
participation in the Indonesian War of 1945, and Dutch participation in the Indonesian
War of 1945. Because of the high degree of influence these wars appeared to exert on the
estimation of the model parameters, a future statistical investigation of these wars using a
source with more complete and comprehensive data could be beneficial. Such an
investigation could unveil the basis of the influence these wars had on the 20th Century
model in this study. Considering that only two covariate patterns out of twenty were
highly influential, the reduced model was deemed to be a generally good predictor of the
winner in a 20th Century extra-state war.
The aforementioned observations were removed and a new model was estimated.
Summary figures and diagnostic plots for this model are not given because only the
changes in both the coefficients and odds ratios were of interest. The coefficient for
duration in Figure 25 was $\hat{\beta}_1 = -1.19$. The coefficient in the revised model was
computed to be $\hat{\beta}_1 = -4.96$, which was a notable change. The more drastic change,
however, was in the odds ratio. The odds ratio for duration in Figure 25 was 0.3, but the
odds ratio for duration in the revised model was 0.01. The odds ratio in the revised
model showed that, with the influential observations deleted, the non-state faction was
100 times more likely to win a prolonged extra-systemic war in the 20th Century than was
the state.

The drastic change in odds ratios from the reduced model to the revised model
demonstrated the significant amount of influence that the two identified colonial wars had
on a logistic regression model using duration to predict the winner of a 20th Century
extra-systemic war. It is possible that other unidentified conditions existed within both
the Italo-Libyan War and the Indonesian War that could account for their influence on the
results of the model in this study. However, comprehensive data concerning these
particular wars were not available from the COWP.


Aggregated  Extra-State  Wars  Model. 

The results from the stepwise procedure were used to justify estimating a univariate
model with Duration_ES_UNS as the covariate. The logistic regression table is given in
the MINITAB output of Figure 29.

Figure 29: Logistic Regression Results for Aggregated Extra-State Wars


The p-value for the likelihood ratio test was 0.01, which suggested that at least one of the
estimated parameters was nonzero, since 0.01 < 0.05. The p-value for the Wald statistic
on duration confirmed that Duration_ES_UNS was significant to the model, because
0.015 < 0.05. Furthermore, the goodness-of-fit tests showed the model to be correctly
specified, as each of the p-values was larger than $\alpha = 0.05$. The p-value for the deviance
statistic, $p = 0.15$, was much smaller than that for both the Pearson chi-square and
Hosmer-Lemeshow statistics. However, the degrees of freedom for both $X^2$ and $D$
were identical, and the deviance statistic was larger than the Pearson chi-square statistic,
so the smaller p-value for the deviance was understandable. The logit for this model is

$$g(\mathrm{Duration}) = 0.807665 - (0.718876 \times \mathrm{Duration}) \tag{4.3}$$
The covariate label is truncated in equation (4.3) for the purpose of expressing the form
of the logit. The covariate and response labels are truncated also to express the form of
the logistic regression model for this case, which is

$$P(\mathrm{Winner} \mid \mathrm{Duration}) = \frac{e^{g(\mathrm{Duration})}}{1 + e^{g(\mathrm{Duration})}} \tag{4.4}$$

Equation (4.4) yields the conditional probability of the winner of an extra-systemic war,
given that the war lasts a certain number of days. In general, the results in Figure 29
confirmed that the logistic regression model containing only the duration of the conflict
was a good predictor of the winner of an extra-systemic war.


Odds  Ratio  Interpretation. 

The odds ratio for the aggregated model was slightly larger than that for the 20th
Century model. However, the odds ratio still favored the non-state actor. The state
participant was 0.49 times as likely, or nearly half as likely, to win an extra-state war, for
every approximate 1426-day increase in the duration of the war. Equivalently, the
non-state participant was roughly twice as likely to defeat the state force, for every 1426 days
that the conflict continued. As with the 20th Century model, though to a lesser degree, it
appeared that a long war strongly favored the non-state actor in an extra-systemic war.
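
The odds ratio for the aggregated model follows from the slope in equation (4.3), and its effect compounds over successive units of duration. The sketch below is illustrative only. It assumes that one scaled unit corresponds to roughly 1426 days, as described for the 20th Century model, and that the fitted logit is linear in scaled duration.

```python
import math

coef_duration = -0.718876              # slope from equation (4.3)
odds_ratio = math.exp(coef_duration)   # odds ratio per scaled unit of duration

# Multiplicative change in the odds of a state victory after k scaled units,
# and the corresponding odds in favor of the non-state actor.
for k in (1, 2, 3):
    state_factor = odds_ratio ** k
    print(k, round(state_factor, 3), round(1.0 / state_factor, 2))
```

For a single unit the inverse is slightly above 2, which is the basis for describing the non-state participant as about twice as likely to prevail.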

Why was the non-state actor less likely to be victorious when the data were
aggregated than when the 20th Century data were considered separately? One possible
explanation involved considering the response frequencies between the two models.
Specifically, for the 20th Century model, there were 11 observations in which the
non-state actor won, while for the aggregated model, there were 19. Hence, the distribution of
that response category between the two centuries was already skewed in favor of the 20th
Century. The addition of 8 observations where the non-state actor won in the aggregated
model simply increased the likelihood of a non-state victory.

Another explanation could come from the specifics of the 19th Century extra-state
wars, since most of these were colonial wars. Because the state participant was
victorious in most of the 19th Century wars, additional statistical studies into the tactics
and techniques used by these states may reveal the secrets to their successes.


Diagnostics and Plots.

The $\Delta\hat{\beta}_j$ plot, Figure 30, was examined first. It displayed a greater degree of
separation between the influence points and the remaining observations. That is, the
influence points were easier to identify in the $\Delta\hat{\beta}_j$ plot than the poorly fit covariate
patterns were in either the $\Delta X_j^2$ or $\Delta D_j$ plots.
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Beta  Change  Plot  (Aggregated  Extra-State  Wars) 


Figure  30:  Beta  Change  Plot  for  Extra-State  Wars  (all  n  =  59  observations) 

Four observations were designated as influence points. Their respective values for
$\Delta\hat{\beta}_j$ were larger than those of the remaining data points. Three of these
influence points corresponded to the same wars identified from the influence points for
the 20th Century model. The fourth corresponded to the Franco-Tonkin War of 1873. These
same influence points were identified in both the $\Delta X_j^2$ and $\Delta D_j$ plots,
which are given in Figure 31 and Figure 32.

Figure 31: Deviance Change Plot for Extra-State Wars (all n = 59 observations)

Figure 32: Chi-Square Change Plot for Extra-State Wars (n = 59 observations)

One course of action could have been to delete the influence points and estimate a
new model. The four influence points covered only a range of $\Delta\hat{\beta}_j = 0.15$
to $\Delta\hat{\beta}_j = 0.35$. Based on the recommendation by Hosmer and Lemeshow that
values of $\Delta\hat{\beta}_j > 1$ generally indicate the necessity for a new model, the
influence points above were insufficiently large to justify fitting a new model (Hosmer
and Lemeshow, 2000:180). Nonetheless, the six observations corresponding to the four
influential covariate patterns were deleted, and a new model was estimated only to assess
the change in odds ratios.

The odds ratio for this revised model, 0.27, favored an insurgent victory even more
strongly than that of the original model. Rather than the non-state faction being about
two times more likely to win a long war, as implied by the 0.49 odds ratio in the original
model, the insurgency was now nearly four times more likely to win a long war (1/0.27 is
approximately 3.7). Granted, the change in likelihood here was not as large as that from
the 20th Century model, but the results were still disconcerting. It again appeared that
some useful insights could be gained from additional investigations into the 19th Century
extra-systemic wars, particularly those which were identified as influential in this
study.


20th Century Intrastate Wars.

The initial model estimated for the 20th Century intrastate wars data followed the
recommendations from the stepwise selection procedure and contained three covariates:
Duration_IS_UNS_20, C_Dead/TotDead_IS_UNS_20, and Dead/Pop_IS_UNS_20. That is, stepwise
regression considered the length of an intrastate war, the proportion of total deaths
borne by the state participant, and the proportion of the state's total population lost
to war deaths as significant in predicting the winner of an intrastate war.
The initial model fit from MINITAB is shown in Figure 33.

Binary Logistic Regression: Winner_IS_UN versus Duration_IS_, Dead/Pop_IS_, ...

Response Information

Variable          Value  Count
Winner_IS_UNS_20  1         26  (Event)
                  2         17
                  Total     43

Logistic Regression Table

                                                                Odds     95% CI
Predictor                 Coef       SE Coef       Z      P    Ratio  Lower  Upper
Constant                  1.26777    0.534039   2.37  0.018
Duration_IS_UNS_20       -1.12622    0.411762  -2.74  0.006     0.32   0.14   0.73
Dead/Pop_IS_UNS_20       -0.359294   0.420409  -0.85  0.393     0.7    0.31   1.59
C_Dead/TotDead_IS_UNS_20  0.9499     0.699907   1.36  0.175     2.59   0.66  10.19

Log-Likelihood = -22.436
Test that all slopes are zero: G = 12.841, DF = 3, P-Value = 0.005

Goodness-of-Fit Tests

Method           Chi-Square  DF      P
Pearson             39.8724  39  0.431
Deviance            44.8716  39  0.239
Hosmer-Lemeshow      4.5651   8  0.803

Figure 33: Results for Initial Intrastate Wars Model (20th Century)


Each of the p-values for the goodness-of-fit tests was larger than 0.05, so those
statistics showed the model to fit adequately. Additionally, the p-value for the
likelihood ratio test was 0.005, which was smaller than $\alpha = 0.05$. This result
rejected the null hypothesis of equation (2.29) and indicated that at least one $\beta_j$
was nonzero. The next task was to determine which of the three covariates were significant
to the model. Thus, the p-value for each Wald statistic, from equation (2.32), was
examined to determine covariate significance.

The p-value for each Wald statistic is found in the fourth column of the logistic
regression table in Figure 33. Only one of the three covariates was found to be
significant. The p-values for C_Dead/TotDead_IS_UNS_20 and Dead/Pop_IS_UNS_20 were 0.175
and 0.393, respectively. Since 0.175 > 0.05 and 0.393 > 0.05, neither of these covariates
was significant. The question was raised as to why these two covariates were allowed to
enter the model in the stepwise selection procedure but were not truly significant to it.
The purpose of stepwise selection was to provide guidance in constructing an adequate
model. That is, the results from stepwise regression provided a set of covariates which
would yield a logistic regression model deemed adequate by the goodness-of-fit tests.
Consequently, individual significance was not part of the stepwise assessment. The
p-value for Duration_IS_UNS_20, however, did imply significance, since 0.006 < 0.05. The
next logical step was to fit a new logistic regression model containing only
Duration_IS_UNS_20.
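The Wald screening described above can be reproduced from the coefficients and standard
errors printed in Figure 33. The following is a minimal sketch, not the MINITAB procedure
itself: it assumes the usual large-sample form of the Wald statistic (coefficient divided
by its standard error, compared with a standard normal distribution), and the function and
variable names are hypothetical.

```python
import math


def wald_test(coef: float, se: float) -> tuple[float, float]:
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = coef / se
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# (coefficient, standard error) pairs copied from Figure 33.
figure_33 = {
    "Duration_IS_UNS_20": (-1.12622, 0.411762),
    "Dead/Pop_IS_UNS_20": (-0.359294, 0.420409),
    "C_Dead/TotDead_IS_UNS_20": (0.9499, 0.699907),
}

alpha = 0.05
results = {name: wald_test(c, s) for name, (c, s) in figure_33.items()}
significant = [name for name, (_, p) in results.items() if p < alpha]

for name, (z, p) in results.items():
    print(f"{name}: z = {z:.2f}, p = {p:.3f}")
print("significant at alpha = 0.05:", significant)
```

Under these assumptions only the duration covariate falls below the 0.05 threshold, which
agrees with the p-values quoted from Figure 33.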

The MINITAB output for the reduced model is given in Figure 34. The model was
univariate and took the form of equation (2.1). The goodness-of-fit p-values were each
again larger than $\alpha = 0.05$, which implied model adequacy. The p-value for the Wald
statistic of Duration_IS_UNS_20 was 0.004, which confirmed that Duration_IS_UNS_20
maintained its position as a significant predictor of the winner of a 20th Century
intrastate war. The results from the goodness-of-fit and Wald tests showed that not only
was the reduced model adequate, but also that it was correctly specified from the
available COWP data.

Binary Logistic Regression: Winner_IS_UNS_20 versus Duration_IS_UNS_20

Response Information

Variable          Value  Count
Winner_IS_UNS_20  1         26  (Event)
                  2         17
                  Total     43

Logistic Regression Table

                                                         Odds     95% CI
Predictor            Coef      SE Coef       Z      P   Ratio  Lower  Upper
Constant             0.867621  0.390469   2.22  0.026
Duration_IS_UNS_20  -1.08163   0.375676  -2.88  0.004    0.34   0.16   0.71

Log-Likelihood = -23.504
Test that all slopes are zero: G = 10.706, DF = 1, P-Value = 0.001

Goodness-of-Fit Tests

Method           Chi-Square  DF      P
Pearson             42.6244  39  0.318
Deviance            47.0073  39  0.177
Hosmer-Lemeshow      8.4269   8  0.393

Figure 34: Reduced Model Results
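The log-likelihoods printed in Figure 33 and Figure 34 also allow a quick nested-model
check. The sketch below is an illustration added here and not an analysis reported in the
original study: it assumes the standard likelihood ratio statistic G = -2(LL_reduced -
LL_full), referred to a chi-square distribution with degrees of freedom equal to the
number of covariates dropped. The helper name `chi2_sf` is hypothetical and uses closed
forms that hold only for 1 or 2 degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail chi-square probability (closed forms for df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch handles only df = 1 or df = 2")


ll_full = -22.436     # three-covariate model, Figure 33
ll_reduced = -23.504  # duration-only model, Figure 34

# Nested comparison: does dropping the two death-ratio covariates hurt the fit?
g_nested = -2.0 * (ll_reduced - ll_full)
p_nested = chi2_sf(g_nested, df=2)

# Overall test of the reduced model, using G as printed in Figure 34.
g_reduced = 10.706
p_reduced = chi2_sf(g_reduced, df=1)

print(f"nested G = {g_nested:.3f}, p = {p_nested:.3f}")
print(f"reduced-model G = {g_reduced:.3f}, p = {p_reduced:.3f}")
```

A nested p-value well above 0.05 would be consistent with the decision to drop the two
non-significant covariates, while the second p-value reproduces the 0.001 shown in
Figure 34.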


Odds  Ratio  Interpretation. 

The odds ratio, at 0.34, suggested that the rebel or insurgent faction was more
likely to win an intrastate war, given a single-step increase in the duration of the
conflict. The reference category for Winner_IS_UNS_20 was 1, corresponding to a state
victory. Therefore, the odds ratio needed to be larger than 1 in order to imply a greater
likelihood of the state winning an intrastate war than the non-state actor. Just as with
the extra-state models, a unit-length increase in duration needed to be defined such that
the odds ratio could be accurately interpreted. After reversing the unit normal scaling
procedure described by equation (3.6), a one-step change in intrastate war duration was
found to be approximately 1679 days. Thus, given a duration of slightly more than four and
a half years, the rebel faction was nearly three times more likely to emerge victorious
than the state from an intrastate war in the 20th Century.
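The odds ratio and its 95% confidence limits in Figure 34 follow directly from the
duration coefficient and its standard error. The sketch below assumes the conventional
Wald interval, exp(coef ± 1.96 × SE); the function name is hypothetical, and the critical
value 1.96 is an assumption about how the printed limits were obtained.

```python
import math


def odds_ratio_with_ci(coef: float, se: float, z_crit: float = 1.96):
    """Return (odds ratio, lower limit, upper limit) for a logistic coefficient."""
    point = math.exp(coef)
    lower = math.exp(coef - z_crit * se)
    upper = math.exp(coef + z_crit * se)
    return point, lower, upper


# Duration_IS_UNS_20 coefficient and standard error from Figure 34.
point, lower, upper = odds_ratio_with_ci(-1.08163, 0.375676)

# Reciprocal: multiplier on the odds of a rebel victory per 1679-day step.
rebel_multiplier = 1.0 / point

print(f"odds ratio = {point:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print(f"rebel odds multiplier = {rebel_multiplier:.2f}")
```

Because the whole interval lies below 1, the direction of the effect (longer wars favoring
the rebel faction) does not depend on the point estimate alone.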


The combination of goodness-of-fit tests, diagnostic plot examination, and odds
ratio interpretation demonstrated that for the variables and data available from the
COWP, a univariate model containing the duration of an intrastate war adequately
predicted the winner of the conflict. The logit for the reduced model was expressed as

$$g(\text{Duration}) = 0.86762 - 1.08163\,(\text{Duration}), \qquad (4.5)$$

and the binary logistic regression model for 20th Century intrastate wars was given by

$$P(\text{Winner} \mid \text{Duration}) = \frac{e^{g(\text{Duration})}}{1 + e^{g(\text{Duration})}}. \qquad (4.6)$$

Again, the covariate labels Duration_IS_UNS_20 and Winner_IS_UNS_20 were truncated for
the purpose of explicitly expressing the logit and binary model. Equation (4.6) yields
the conditional probability of the winner of an intrastate war, given that the war lasts
a certain number of days.
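Equations (4.5) and (4.6) can be evaluated numerically as follows. This is a sketch under
two stated assumptions: the duration is supplied in the scaled units used to fit the model
(one unit corresponding to roughly 1679 days), and the modeled event is a state victory,
since value 1 is flagged as the event in Figure 34. The function names are hypothetical,
and the mean duration needed to convert raw days to scaled units is not reproduced here.

```python
import math

INTERCEPT = 0.86762  # constant in equation (4.5)
SLOPE = -1.08163     # duration coefficient in equation (4.5)


def logit(duration_scaled: float) -> float:
    """Equation (4.5): linear predictor for a scaled duration."""
    return INTERCEPT + SLOPE * duration_scaled


def prob_state_wins(duration_scaled: float) -> float:
    """Equation (4.6): e^g / (1 + e^g), written in the equivalent 1 / (1 + e^-g) form."""
    return 1.0 / (1.0 + math.exp(-logit(duration_scaled)))


p_mean = prob_state_wins(0.0)    # war of average (scaled) duration
p_longer = prob_state_wins(1.0)  # one scaled unit longer

# The ratio of odds one unit apart recovers the reported odds ratio.
odds_ratio = (p_longer / (1 - p_longer)) / (p_mean / (1 - p_mean))

print(f"P(state wins) at mean duration: {p_mean:.3f}")
print(f"P(state wins) one unit longer:  {p_longer:.3f}")
print(f"implied odds ratio: {odds_ratio:.2f}")
```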


Diagnostics and Plots.

The diagnostic plots were examined next to locate influence points. The plots for
$\Delta X_j^2$, $\Delta D_j$, and $\Delta\hat{\beta}_j$ are given in Figure 35, Figure 36,
and Figure 37, respectively.

Figure 35: Chi-Square Change Plot for Reduced Intrastate Wars Model

Figure 36: Deviance Change Plot for Reduced Intrastate Wars Model

Figure 37: Beta Change Plot for Reduced Intrastate Wars Model

It was easy to identify six influence points from the $\Delta\hat{\beta}_j$ plot. The
poorly fit points were not as apparent in either the $\Delta X_j^2$ plot or the
$\Delta D_j$ plot. The six influence points corresponded to five 20th Century intrastate
wars: the Cambodia-Khmer Rouge War of 1970, the Pinochet Rebellion in Chile in 1973, the
Somali Secession from Ethiopia in 1976, the Communist Rebellion in El Salvador in 1979,
and the Renamo Rebellion in Mozambique in 1979. The data for this research were organized
at the participant level, so the six influence points concerned specific actors in the
aforementioned intrastate wars. Table 8 gives the war, participant,
$\Delta\hat{\beta}_j$ value, and duration of involvement identified from the influence
points.

Table 8: Observations Identified by Influence Points

                                                                   Beta      Duration
Intrastate War                                Participant          Change      (days)
Cambodia vs. Khmer Rouge                      United States        0.147369       977
                                              Republic of Vietnam  0.178397       766
Chile vs. Pinochet Rebels                     Chile                0.335365         5
Ethiopia vs. Somali Rebels                    Somali Rebels        0.225726      2376
El Salvador vs. Salvadorean Democratic Front  El Salvador          0.242076      4599
Mozambique vs. Renamo                         Mozambique           0.275127      4733

The largest of these $\Delta\hat{\beta}_j$ values was about 0.34, which is smaller than 1,
so the magnitudes of the influence points were not sufficient to justify deleting the six
observations and fitting a new model.
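The screening rule applied to Table 8 amounts to comparing each beta-change value with the
threshold of 1 attributed to Hosmer and Lemeshow earlier in this chapter. A minimal sketch
of that comparison, with hypothetical variable names and the values copied from Table 8,
is:

```python
# Beta-change values from Table 8, keyed by (war, participant).
beta_change = {
    ("Cambodia vs. Khmer Rouge", "United States"): 0.147369,
    ("Cambodia vs. Khmer Rouge", "Republic of Vietnam"): 0.178397,
    ("Chile vs. Pinochet Rebels", "Chile"): 0.335365,
    ("Ethiopia vs. Somali Rebels", "Somali Rebels"): 0.225726,
    ("El Salvador vs. Salvadorean Democratic Front", "El Salvador"): 0.242076,
    ("Mozambique vs. Renamo", "Mozambique"): 0.275127,
}

THRESHOLD = 1.0  # values above this would suggest refitting the model

largest = max(beta_change.values())
flagged = [key for key, value in beta_change.items() if value > THRESHOLD]

print(f"largest beta change: {largest:.2f}")
print("observations exceeding the threshold:", flagged)
```

An empty list of flagged observations corresponds to the conclusion that none of the six
points warranted deletion.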


Aggregated  Intrastate  Wars. 

The aggregated intrastate wars model was univariate, containing Duration_IntS_UNS as
the independent variable, as recommended by the stepwise selection procedure. The
results from the model estimation are given in Figure 38. The Pearson chi-square,
Deviance, and Hosmer-Lemeshow goodness-of-fit tests showed the model to be adequate.
Each of the p-values for the goodness-of-fit statistics was larger than $\alpha = 0.05$,
as required for implying a good model fit. The p-value for the likelihood ratio test was
0.002, so the null hypothesis that all model coefficients are zero was rejected. Thus,
the p-value for the Wald statistic on Duration_IntS_UNS was examined to determine the
individual significance of the covariate. Since 0.003 < 0.05, it was concluded that the
duration of the conflict was significant to the model predicting the winner.

Binary Logistic Regression: Winner_IntS versus Duration_IntS_UNS

Response Information

Variable     Value  Count
Winner_IntS  1         49  (Event)
             2         24
             Total     73

Logistic Regression Table

                                                        Odds     95% CI
Predictor           Coef       SE Coef      Z      P   Ratio  Lower  Upper
Constant            0.784946   0.270836   2.9   0.004
Duration_IntS_UNS  -0.804099   0.271266  -2.96  0.003    0.45   0.26   0.76

Log-Likelihood = -41.264
Test that all slopes are zero: G = 9.934, DF = 1, P-Value = 0.002

Goodness-of-Fit Tests

Method           Chi-Square  DF      P
Pearson             73.2111  68  0.311
Deviance            82.5286  68  0.111
Hosmer-Lemeshow      4.4029   8  0.819

Figure 38: Results for Univariate Intrastate Wars Model


Odds  Ratio  Interpretation. 

The odds ratio, at 0.45, suggested that the non-state actor was still more likely to
win an intrastate war, given a single-step increase in the duration of the conflict. Just
as with the 20th Century intrastate model, a unit-length increase in duration was
approximately 1679 days. Thus, given a duration of slightly more than four and a half
years, the rebel faction was over two times more likely to emerge victorious than the
state from an intrastate war.

The combination of goodness-of-fit tests, diagnostic plot examination, and odds
ratio interpretation demonstrated that for the variables and data available from the
COWP, a univariate model containing the duration of an intrastate war adequately
predicted the winner of the conflict. The logit for the model was expressed as

$$g(\text{Duration}) = 0.78 - 0.80\,(\text{Duration}), \qquad (4.7)$$

and the binary logistic regression model for aggregated intrastate wars was given by

$$P(\text{Winner} \mid \text{Duration}) = \frac{e^{g(\text{Duration})}}{1 + e^{g(\text{Duration})}}. \qquad (4.8)$$

Again, the covariate labels Duration_IntS_UNS and Winner_IS_UNS were truncated for the
purpose of explicitly expressing the logit and binary model. Equation (4.8) yields the
conditional probability of the winner of an intrastate war, given that the war lasts a
certain number of days.


Diagnostics and Plots.

The diagnostic plot for $\Delta\hat{\beta}_j$ is given in Figure 39. Four influence
points were clearly distinguished in the plot. The four influence points corresponded to
the following intrastate wars: the Russo-Circassian War of 1829, the Somali Secession from
Ethiopia in 1976, the Communist Rebellion in El Salvador in 1979, and the Renamo Rebellion
in Mozambique in 1979. The observations corresponding to the influence points concerned
the following participants: Russia, Somali rebels, El Salvador, and Mozambique. However,
the $\Delta\hat{\beta}_j$ values for these influence points were much smaller than 1, so
there was insufficient evidence to suggest deleting these observations and fitting a new
model.

Figure 39: Beta Change Plot for Aggregated Intrastate Wars Model

The poorly fit covariate patterns were not as easily identified in either the
$\Delta D_j$ plot or the $\Delta X_j^2$ plot. Two patterns were identified as poorly fit.
That is, 2 of the 68 distinct covariate values did not follow the general pattern of the
plots as did the remaining 66. Four observations corresponded to these poorly fit
covariate patterns. The plot for $\Delta D_j$ is shown in Figure 40, and the plot for
$\Delta X_j^2$ is shown in Figure 41.

Figure 40: Deviance Change Plot for Aggregated Intrastate Wars

Figure 41: Pearson Chi-Square Change Plot for Intrastate Wars

The intrastate wars corresponding to the two poorly fit patterns were the War
Between the States and the Somali Secession from Ethiopia in 1976. Since only 4 out of
n = 73 observations were associated with these patterns, there was insufficient evidence
to suggest deleting the 4 data points and estimating a new model. Furthermore, the
$\Delta D_j$ and $\Delta X_j^2$ values for these 2 patterns were moderate in relation to
the rest of the points on the plots, so noting the range in which their estimated
probabilities lay gave additional insight into the amount of leverage they exerted on the
estimation of the model coefficients.

For Figure 41, the data point for Union involvement in the War Between the States
possessed a delta chi-square value of $\Delta X_{23}^2 = 0.86$, a delta deviance value of
$\Delta D_{23} = 1.43$, and a leverage value of $h_{23} = 0.03$. Its estimated
probability falling within the region $0.7 < \hat{\pi}(x_{23}) < 0.9$ implied that its
leverage was moderate, compared to the other observations (Hosmer and Lemeshow,
2000:175). An examination of the plots of $\Delta X_j^2$ versus $h_j$ and $\Delta D_j$
versus $h_j$, given in Figure 42 and Figure 43, respectively, showed this to be the case.
That is, its leverage value was sufficiently large to have a moderate effect on the
estimation of the model parameters.

Figure 42: Pearson Chi-Square Change vs. Leverage Plot for Intrastate Wars

Figure 43: Deviance Change vs. Leverage Plot for Intrastate Wars

In contrast, the observation concerning the Somali rebels possessed the values
$\Delta X_j^2 = 2.06$, $\Delta D_j = 3.12$, and $h_j = 0.06$. Its estimated probability,
however, lay in the range $0.3 < \hat{\pi}(x_j) < 0.7$. These values, with the exception
of its estimated probability, were larger than those for the aforementioned observation,
and its leverage fell within a cluster of 11 data points whose leverages were considered
large in comparison to those of the remaining 62 observations. Therefore, this
observation was not only an influence point, but it also exerted a greater amount of
leverage on the estimation of the model coefficients than did the aforementioned
observation.

Overall, the aggregated model for intrastate wars was considered to be a good
predictor of the winner. It was not necessary to delete the observations identified from
the diagnostic plots and estimate a new model, because their respective values of
$\Delta X_j^2$, $\Delta D_j$, and $\Delta\hat{\beta}_j$ were not large enough to justify
such an action. However, additional investigations into the aforementioned influential
wars may be necessary to determine the nature of their effects on the model presented in
this study.


Multinomial  Logistic  Regression  Models  on  Outcome 

Two multinomial models were estimated for predicting the outcome of an interstate
war. An initial model for the 20th Century interstate wars data set contained the
covariates for conflict duration and the proportion of total deaths borne by the
participant, or Duration_UNS_20 and Dths/TDeaths_UNS_20. The two covariates included in
the initial 20th Century model resulted from the stepwise selection recommendation.
Examination of their Wald statistics determined whether one or both covariates were truly
significant to the interstate wars model at the $\alpha = 0.05$ level. The model for the
aggregated interstate wars contained only one covariate: Deaths/TotDeaths_UNS. The
results for the 20th Century data are presented first. The Pearson chi-square and Deviance
goodness-of-fit tests were computed for each of these multinomial models.

The ultimate objective of this investigation was to demonstrate the applicability of
multinomial logistic regression to war termination studies. The summary figures from
the MINITAB outputs were considered sufficient to accomplish this goal. Each figure
contains the coefficient value, standard error of the coefficient, Wald statistic, p-value
of the Wald statistic, odds ratio, and 95% confidence limits on the odds ratio for each of
the covariates in each of the logits in the multinomial model. The frequency of each
outcome can be found at the top of each figure. The log-likelihood, likelihood ratio
statistic, p-value of the likelihood ratio statistic, Pearson chi-square statistic,
p-value of the Pearson chi-square statistic, Deviance statistic, and p-value of the
Deviance statistic for the multinomial model are given at the bottom of each figure.

Each  logit  was  referenced  to  the  first  outcome,  or  Victory  by  Military  Imposition. 
As  such,  each  odds  ratio  was  a  comparison  of  the  outcome  in  question  to  the  reference 
outcome.  The  odds  ratio  quantified  how  much  more  or  less  likely  the  outcome  in 
question  was  to  occur  than  the  reference  outcome,  given  a  unit  increase  in  the  covariate 
values.  The  odds  ratios  were  important  to  detecting  patterns  within  the  COWP  data. 


20th Century Interstate Wars

The initial model for the 20th Century data was bivariate. This model was
estimated in response to the results from the stepwise selection procedure. The
goodness-of-fit statistics and the individual Wald statistics were examined to determine
if the initial model was sufficient to warrant further analysis. The initial model results
are given in Figure 44.

The p-values for the two goodness-of-fit statistics were very high, which
suggested the initial model to be adequately estimated. This was expected, in light of the
results from the stepwise procedure in Chapter II. The p-value for the likelihood ratio
statistic was smaller than 0.001, which rejected the null hypothesis in equation (2.29)
and suggested that at least one $\beta_j$ was nonzero.

The p-values for the Wald statistics, however, indicated that only one of the
covariates was significant to the model at the 0.05 significance level. Each of the
p-values for the Wald statistics concerning Dths/TDeaths_UNS_20 was no larger than 0.001.
This implied that the proportion of total combat deaths sustained by the participant
should be the only covariate, among the available COWP data, included in a multinomial
model for 20th Century interstate wars. In contrast, each of the Wald statistic p-values
for the duration of the conflict was larger than $\alpha$. Thus, a reduced model
containing only Dths/TDeaths_UNS_20 was estimated. The implication from the statistics in
Figure 44 was that the duration of an interstate war was not important to the outcome of a
20th Century interstate conflict. The length of the war may actually be important, but the
COWP data did not reveal such a trend. Additional studies of interstate wars using other
data sources may therefore be necessary to identify other relevant variables which were
not available in the COWP data.

Figure 45 shows the results from fitting the reduced multinomial model. Not only
did both goodness-of-fit statistics show the model to be adequate, but also the Wald
statistic p-value for Dths/TDeaths_UNS_20 was no larger than 0.001 in every logit, which
implied that the single covariate maintained its significance to the model.

The odds ratios were interpreted individually. A one-unit change in
Dths/TDeaths_UNS_20 was defined for the purpose of interpreting the odds ratios. The
standard deviation for the proportion of total deaths borne by the participant was
computed to be 0.26, so each odds ratio was interpreted for an increase of approximately
0.26 (26 percentage points) in Dths/TDeaths_UNS_20.




Nominal Logistic Regression: Outcome(PR2) versus Dths/TDeaths, Duration_UNS

Response Information

Variable         Value  Count
Outcome(PR2) 20  1         62  (Reference Event)
                 5         23
                 4         16
                 3         29
                 2         37
                 Total    167

Logistic Regression Table

                                                           Odds     95% CI
Predictor             Coef        SE Coef       Z      P  Ratio  Lower  Upper
Logit 1: (5/1)
Constant             -0.72514     0.286304  -2.53  0.011
Dths/TDeaths_UNS_20   1.25622     0.305132   4.12  0       3.51   1.93   6.39
Duration_UNS_20      -0.24161     0.270117  -0.89  0.371   0.79   0.46   1.33
Logit 2: (4/1)
Constant             -1.20834     0.351733  -3.44  0.001
Dths/TDeaths_UNS_20   1.3249      0.334455   3.96  0       3.76   1.95   7.25
Duration_UNS_20      -0.598418    0.42479   -1.41  0.159   0.55   0.24   1.26
Logit 3: (3/1)
Constant             -0.425834    0.259768  -1.64  0.101
Dths/TDeaths_UNS_20   0.984011    0.292405   3.37  0.001   2.68   1.51   4.75
Duration_UNS_20      -0.118238    0.221858  -0.53  0.594   0.89   0.58   1.37
Logit 4: (2/1)
Constant             -0.212314    0.247951  -0.86  0.392
Dths/TDeaths_UNS_20   1.12896     0.278471   4.05  0       3.09   1.79   5.34
Duration_UNS_20       0.0069449   0.196298   0.04  0.972   1.01   0.69   1.48

Log-Likelihood = -231.512
Test that all slopes are zero: G = 39.155, DF = 8, P-Value = 0.000

Goodness-of-Fit Tests

Method    Chi-Square   DF      P
Pearson      624.98   636  0.615
Deviance     460.251  636  1

Figure 44: Results for Initial 20th Century Interstate Wars Model


Nominal Logistic Regression: Outcome(PR2) 20 versus Dths/TDeaths_UNS_20

Response Information

Variable         Value  Count
Outcome(PR2) 20  1         62  (Reference Event)
                 5         23
                 4         16
                 3         29
                 2         37
                 Total    167

Logistic Regression Table

                                                          Odds     95% CI
Predictor             Coef       SE Coef       Z      P  Ratio  Lower  Upper
Logit 1: (5/1)
Constant             -0.729073   0.285224  -2.56  0.011
Dths/TDeaths_UNS_20   1.30012    0.303297   4.29  0       3.67   2.03   6.65
Logit 2: (4/1)
Constant             -1.1367     0.328547  -3.46  0.001
Dths/TDeaths_UNS_20   1.40105    0.329295   4.25  0       4.06   2.13   7.74
Logit 3: (3/1)
Constant             -0.433706   0.259406  -1.67  0.095
Dths/TDeaths_UNS_20   1.01319    0.291191   3.48  0.001   2.75   1.56   4.87
Logit 4: (2/1)
Constant             -0.207563   0.246254  -0.84  0.399
Dths/TDeaths_UNS_20   1.14257    0.277783   4.11  0       3.13   1.82   5.4

Log-Likelihood = -233.321
Test that all slopes are zero: G = 35.536, DF = 4, P-Value = 0.000

Goodness-of-Fit Tests

Method    Chi-Square   DF      P
Pearson      543.695  556  0.637
Deviance     407.684  556  1

Figure 45: Summary of Results for 20th Century Interstate Wars


In Logit 1, the outcome Defeat by Negotiated Settlement was compared to the
reference outcome Victory by Military Imposition. Its odds ratio was 3.67, which was
expressed using equation (2.38).

$$\widehat{OR}_5 = \frac{P(Y=5 \mid x=i+0.26)\,/\,P(Y=1 \mid x=i+0.26)}{P(Y=5 \mid x=i)\,/\,P(Y=1 \mid x=i)} = 3.67 \qquad (4.9)$$

In other words, a participant in an interstate war is about three and a half times more
likely to lose the war through a negotiated settlement than he is to win through military
imposition, assuming that he bears more than one quarter of the total casualties.

In Logit 2, the outcome Victory by Negotiated Settlement was compared to the
reference outcome. With an odds ratio of 4.06, equation (2.38) became

$$\widehat{OR}_4 = \frac{P(Y=4 \mid x=i+0.26)\,/\,P(Y=1 \mid x=i+0.26)}{P(Y=4 \mid x=i)\,/\,P(Y=1 \mid x=i)} = 4.06. \qquad (4.10)$$

That is, an interstate war actor is about four times more likely to win the war through a
negotiated settlement than through military imposition, assuming that he bears more than
one quarter of the total casualties.

In Logit 3, the outcome Stalemate was compared to the reference outcome. Its
odds ratio was 2.75, so equation (2.38) became

$$\widehat{OR}_3 = \frac{P(Y=3 \mid x=i+0.26)\,/\,P(Y=1 \mid x=i+0.26)}{P(Y=3 \mid x=i)\,/\,P(Y=1 \mid x=i)} = 2.75. \qquad (4.11)$$

Therefore, an interstate war participant is 2.75 times more likely to accept the war as a
stalemate than he is to win it by military imposition, assuming that he bears more than
one quarter of the total casualties.

In the fourth and final logit, the outcome Capitulation was compared to the
reference outcome. It possessed a 3.13 odds ratio, which was substituted into equation
(2.38).

$$\widehat{OR}_2 = \frac{P(Y=2 \mid x=i+0.26)\,/\,P(Y=1 \mid x=i+0.26)}{P(Y=2 \mid x=i)\,/\,P(Y=1 \mid x=i)} = 3.13 \qquad (4.12)$$

Thus, a participant in an interstate war is over three times more likely to capitulate to
the demands of his enemy than he is to win the war through military imposition, assuming
that he bears more than one quarter of the total casualties.

It was interesting to notice that $\widehat{OR}_4 = 4.06$ was the largest of the odds
ratios. It can be said that a nation involved in an interstate war is most likely to be on
the side that wins through a negotiated settlement rather than by military imposition,
provided that the nation in question bears no more than one quarter of the total combat
deaths. In other words, once a belligerent in an interstate war has taken about 26% of the
total casualties, he should begin the process of negotiations to end the war on terms more
favorable to him than to his enemy. This appeared to be the trend when 20th Century
interstate wars were considered alone.


Aggregated  Interstate  Wars  Model. 

The stepwise selection procedure in Chapter II suggested that an aggregated
interstate wars multinomial model be univariate. This recommendation left no room for a
reduced model, so the univariate model was estimated with Deaths/TotDeaths_UNS as
the single covariate. The Pearson chi-square and Deviance goodness-of-fit tests were
examined to assess overall model adequacy, and the p-value for the Wald statistic in each
of the four logits was examined to determine the significance of Deaths/TotDeaths_UNS.
Each of the four odds ratios was also interpreted to identify the most likely outcome for a
nation involved in an interstate war, given that the nation has accepted a certain
percentage of the total battle deaths. Figure 46 shows the MINITAB output for this
multinomial model.

The p-values for both goodness-of-fit tests were much larger than α = 0.05, which
implied that the model was adequate. Each of the Wald statistic p-values was much
smaller than α = 0.05, which confirmed additionally that the covariate
Deaths/TotDeaths_UNS was highly significant to the multinomial model. In fact, its
Wald statistic p-value in all but one of the logits was very close to zero.

A one-unit change in Deaths/TotDeaths_UNS had to be defined for the
purpose of interpreting the odds ratios. Because unit normal scaling was the data scaling
technique used, the sample standard deviation for all n = 225 observations of
Deaths/TotDeaths_UNS was defined as a single-step change in the value of the covariate.
The sample standard deviation for the proportion of total deaths borne by the participant
was computed to be 0.258, so each odds ratio was again interpreted for an approximate
26% increase in Deaths/TotDeaths_UNS. As with the 20th Century model, the reference
outcome for the aggregated model was also Victory by Military Imposition, or category 1.
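
The relationship between a fitted slope and its odds ratio can be written out explicitly. The block below is a sketch under two assumptions that are not stated in this section: that unit normal scaling means z = (x - x̄)/s, and that β_{j0} and β_{j1} denote the constant and slope of the logit comparing outcome j with reference outcome 1. Both symbols are introduced here only for illustration.

```latex
\[
\ln\frac{P(Y = j \mid z)}{P(Y = 1 \mid z)} = \beta_{j0} + \beta_{j1} z,
\qquad z = \frac{x - \bar{x}}{s}, \qquad s = 0.258
\]
\[
OR_j = \frac{P(Y = j \mid z + 1) \,/\, P(Y = 1 \mid z + 1)}
            {P(Y = j \mid z) \,/\, P(Y = 1 \mid z)}
     = e^{\beta_{j1}}
\]
```

Under these assumptions, a one-unit step in z corresponds to a step of s, or about 0.26, in the raw proportion of total deaths, which is why each odds ratio is read against an approximate 26% increase.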

Nominal Logistic Regression: Outcome(PR2) versus Deaths/TotDeaths_UNS

Response Information

Variable        Value   Count
Outcome(PR2)    1          87   (Reference Event)
                5          28
                4          26
                3          31
                2          53
                Total     225

Logistic Regression Table

                                                             Odds      95% CI
Predictor                  Coef    SE Coef       Z       P  Ratio   Lower   Upper
Logit 1: (5/1)
Constant                -1.1022   0.238411   -4.62   0.000
Deaths/TotDeaths_UNS   0.975947   0.236328    4.13   0.000   2.65    1.67    4.22
Logit 2: (4/1)
Constant               -1.13879   0.239475   -4.76   0.000
Deaths/TotDeaths_UNS   0.880387   0.241627    3.64   0.000   2.41    1.50    3.87
Logit 3: (3/1)
Constant              -0.915293   0.218258   -4.19   0.000
Deaths/TotDeaths_UNS   0.666398    0.23343    2.85   0.004   1.95    1.23    3.08
Logit 4: (2/1)
Constant              -0.393903    0.18603   -2.12   0.034
Deaths/TotDeaths_UNS   0.760406   0.202088    3.76   0.000   2.14    1.44    3.18

Log-Likelihood = -321.021
Test that all slopes are zero: G = 28.353, DF = 4, P-Value = 0.000

Goodness-of-Fit Tests

Method      Chi-Square    DF       P
Pearson        688.882   696   0.569
Deviance       524.608   696   1.000

Figure 46: Results for Aggregated Interstate Wars Model
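
The derived columns in Figure 46 can be recomputed from the coefficients and standard errors alone. The following is a minimal sketch rather than the MINITAB procedure itself: it assumes the usual Wald construction (Z = Coef / SE Coef, a two-sided normal p-value, and a 95% interval of exp(Coef ± 1.96 SE)) and uses the closed-form chi-square tail for 4 degrees of freedom for the G test. The function names are hypothetical.

```python
import math

# Slope and standard error for Deaths/TotDeaths_UNS in each logit,
# copied from the MINITAB output in Figure 46.
SLOPES = {
    "Logit 1 (5/1)": (0.975947, 0.236328),
    "Logit 2 (4/1)": (0.880387, 0.241627),
    "Logit 3 (3/1)": (0.666398, 0.23343),
    "Logit 4 (2/1)": (0.760406, 0.202088),
}

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_summary(coef, se):
    """Return (z, two-sided p, odds ratio, CI lower, CI upper) for one slope."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    odds_ratio = math.exp(coef)
    lower = math.exp(coef - Z_975 * se)
    upper = math.exp(coef + Z_975 * se)
    return z, p, odds_ratio, lower, upper


def chi2_sf_df4(g):
    """Upper-tail probability of a chi-square variate with 4 degrees of freedom."""
    return math.exp(-g / 2.0) * (1.0 + g / 2.0)


for name, (coef, se) in SLOPES.items():
    z, p, oratio, lo, hi = wald_summary(coef, se)
    print(f"{name}: Z={z:.2f} P={p:.3f} OR={oratio:.2f} CI=({lo:.2f}, {hi:.2f})")

print(f"G test p-value: {chi2_sf_df4(28.353):.3f}")
```

Each recomputed odds ratio is simply the exponential of the corresponding slope, so the intervals in the figure are symmetric on the log scale, not on the odds-ratio scale.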


In Logit 1, the outcome Defeat by Negotiated Settlement was compared to the
reference outcome Victory by Military Imposition. Its odds ratio was 2.65, which was
expressed using equation (2.38).

\[
OR_5 = \frac{P(Y = 5 \mid x = i + 0.26) \,/\, P(Y = 1 \mid x = i + 0.26)}
            {P(Y = 5 \mid x = i) \,/\, P(Y = 1 \mid x = i)} = 2.65 \tag{4.13}
\]


In  other  words,  a  participant  in  an  interstate  war  is  over  two  and  a  half  times  more  likely 
to  lose  the  war  through  a  negotiated  settlement  than  he  is  to  win  through  military 
imposition,  assuming  that  he  bears  more  than  one  quarter  of  the  total  casualties. 

In  Logit  2,  the  outcome  Victory  by  Negotiated  Settlement  was  compared  to  the 
reference  outcome.  With  an  odds  ratio  of  2.41,  equation  (2.38)  became 

\[
OR_4 = \frac{P(Y = 4 \mid x = i + 0.26) \,/\, P(Y = 1 \mid x = i + 0.26)}
            {P(Y = 4 \mid x = i) \,/\, P(Y = 1 \mid x = i)} = 2.41 \tag{4.14}
\]


That  is,  an  interstate  war  actor  is  nearly  two  and  a  half  times  more  likely  to  win  the  war 
through  a  negotiated  settlement  than  through  military  imposition,  assuming  that  he  bears 
more  than  one  quarter  of  the  total  casualties. 

In  Logit  3,  the  outcome  Stalemate  was  compared  to  the  reference  outcome.  Its 
odds  ratio  was  1.95,  so  equation  (2.38)  became 

\[
OR_3 = \frac{P(Y = 3 \mid x = i + 0.26) \,/\, P(Y = 1 \mid x = i + 0.26)}
            {P(Y = 3 \mid x = i) \,/\, P(Y = 1 \mid x = i)} = 1.95 \tag{4.15}
\]


Therefore,  an  interstate  war  participant  is  nearly  two  times  more  likely  to  accept  the  war 
as  a  stalemate  than  he  is  to  win  it  by  military  imposition,  assuming  that  he  bears  more 
than  one  quarter  of  the  total  casualties. 

In the fourth and final logit, the outcome Capitulation was compared to the
reference outcome. It possessed a 2.14 odds ratio, which was substituted into equation
(2.38).

\[
OR_2 = \frac{P(Y = 2 \mid x = i + 0.26) \,/\, P(Y = 1 \mid x = i + 0.26)}
            {P(Y = 2 \mid x = i) \,/\, P(Y = 1 \mid x = i)} = 2.14 \tag{4.16}
\]


Thus,  a  participant  in  an  interstate  war  is  over  two  times  more  likely  to  capitulate  to  the 
demands  of  his  enemy  than  he  is  to  win  the  war  through  military  imposition,  assuming 
that  he  bears  more  than  one  quarter  of  the  total  casualties. 
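
The four logits can also be combined into predicted outcome probabilities. The sketch below is illustrative only and assumes that the scaled covariate z equals zero at the sample mean of the casualty proportion and that one unit of z is one sample standard deviation (about 0.26); the function name `outcome_probabilities` is hypothetical. Constants and slopes are taken from Figure 46.

```python
import math

# (constant, slope) for each non-reference outcome, from Figure 46.
# Outcome 1 (Victory by Military Imposition) is the reference category.
LOGITS = {
    5: (-1.1022, 0.975947),    # Defeat by Negotiated Settlement
    4: (-1.13879, 0.880387),   # Victory by Negotiated Settlement
    3: (-0.915293, 0.666398),  # Stalemate
    2: (-0.393903, 0.760406),  # Capitulation
}


def outcome_probabilities(z):
    """Return {outcome: probability} at scaled covariate value z."""
    # Odds of each outcome relative to the reference outcome.
    odds = {k: math.exp(b0 + b1 * z) for k, (b0, b1) in LOGITS.items()}
    denom = 1.0 + sum(odds.values())
    probs = {1: 1.0 / denom}
    probs.update({k: v / denom for k, v in odds.items()})
    return probs


for z in (0.0, 1.0):
    probs = outcome_probabilities(z)
    row = ", ".join(f"P(Y={k})={probs[k]:.3f}" for k in sorted(probs))
    print(f"z={z:+.1f}: {row}")
```

The ratio of the odds at z = 1 to the odds at z = 0 reproduces each odds ratio in Figure 46. Predicted probabilities depend on the constants as well as the slopes, so a ranking of outcomes by probability need not match a ranking by odds ratio.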

The largest odds ratio of 2.65 implied that a nation would most likely be defeated
through a negotiated settlement, assuming that the nation in question bore more than one
quarter of the total casualties. A stalemate turned out to be the least likely outcome for
the same conditions. The switch from victory to defeat by negotiated settlement between
the 20th Century and aggregated analyses likely resulted from the effects that the 19th
Century data had on the odds ratio calculations in the aggregated model. Approximately
88% of the 19th Century interstate wars identified in the COWP data ended by force of
arms. This proportion dropped to 69% when the interstate wars from both centuries were
considered together. One might conclude that a far greater prominence was placed on
military force in the 19th Century than in the 20th Century.

A general trend of ending interstate wars by a negotiated settlement presented
itself through the analyses of all interstate wars in the COWP data and the 20th Century
interstate wars alone. This result supports a similar assertion made by Walker in his
Naval War College study (Walker, 1996:1). It was also interesting to note that the
casualty proportions necessary for prompting both outcomes were virtually equal. Thus,
a nation involved in an interstate war should move quickly for a favorable negotiated
settlement once it sustains more than one quarter of all combat deaths.


Summary 

The results in this chapter demonstrated that logistic regression techniques can be
successfully applied to war termination problems. Stepwise selection fulfilled its usual
purpose as a robust technique for identifying the covariates necessary to build an
adequate logistic regression model on the response. For the 19th Century, 20th Century,
and aggregated data on extra-systemic, intrastate, and interstate wars, the stepwise
regression results were examined for accuracy. No logistic regression models for the 19th
Century COWP data on any of the three types of wars were estimated because of the
results from stepwise regression. Consequently, two models were fit for each war type:
one for the 20th Century COWP data and one for the aggregated COWP data.

The final models estimated from extra-systemic war data were found to be good
predictors of the winner. The models were parsimonious, and the winner was dependent
only on the length of the war. Interpretation of their odds ratios revealed that the
non-state belligerent was more likely than the state actor to win a long extra-state war. The
United States has been engaged in the current war in Iraq for nearly four years, which is
longer than the 1426-day duration change identified by the models. The Franco-Tonkin
War of 1873, the Italo-Libyan War of 1920 and the Indonesian War of 1945 were found
to be influential to the estimation of model parameters. Future statistical studies of these
wars using a source with more complete and comprehensive data may reveal the reasons
for their influences on the models in this study.

The two models estimated from the COWP data on intrastate wars were also good
predictors of the winner. Again, the duration of the conflict was found to be the only
available covariate significant to predicting the winner. The odds ratios for these models
showed that the insurgent faction was even more likely to win an intrastate war than
an extra-systemic war. However, the war duration requirement was longer than that
for the extra-state models, about four and a half years. The influential secession
movements and rebellions identified from the diagnostic plots of both models could be
subjects of future investigations for further insights into their influence on the results of
this study.

A general trend of ending interstate wars by a negotiated settlement presented
itself through the results of both the 20th Century and aggregated models. As was the
case with the models on extra-systemic and intrastate wars, the final multinomial models
on interstate wars were also univariate. The single covariate significant to predicting the
outcome of an interstate war, however, was not the length of the war but the percentage
of total casualties sustained by a participating nation. The odds ratios from both models
implied that an interstate war participant should seek a favorable negotiated peace once
he has incurred more than 25% of the total battle deaths.


V. Discussion


Assessment of Current Findings

No models were fit using any of the 19th Century data. As a result, little can be
said statistically regarding shifts in war termination trends between centuries. On the
other hand, the degree to which the parameters, significance tests, and odds ratios
differed between the 20th Century and aggregated models did demonstrate the amount of
influence that 19th Century wars exerted on overall war termination trends.

It was interesting to see that the length of the conflict was most relevant for both
intrastate and extra-state wars. The odds ratios for the 20th Century and aggregated
extra-state wars models revealed a pattern favoring the insurgency faction over time. The
non-state actor was over three times more likely to win when the 20th Century data were
considered separately. This likelihood decreased for the aggregated model, and the
insurgency became less than two times as likely to win. Thus, when duration is
considered alone, an insurgency is more likely to win a prolonged war than the state
which it is fighting.

The proportion of the total number of combat deaths borne by a nation involved in
an interstate war was the most relevant variable for both multinomial models concerning
interstate wars. Each outcome was referenced to the most frequent outcome of victory
through force of arms. It was discovered that the odds ratios for the remaining outcomes
were larger when the 20th Century data were considered alone than when the entire data
set was analyzed. The implications for each case, however, were different. Given that a
participating nation took about 26% of the total casualties, that nation was more likely to
win a 20th Century interstate war through a negotiated settlement than through military
imposition. Pillar reached a similar conclusion by stating that explicit agreements are the
most common form of terminating interstate wars (Pillar, 1983:16-17). His assertion,
however, is broad in the sense that he grouped wars ending in imposed settlements and
wars ending by negotiated settlements together, whereas this research analyzed these two
outcomes separately.

The odds ratios for the aggregated interstate wars model were not as different
from each other as those for the 20th Century model. Negotiated settlements still proved
prevalent, as defeat and victory by negotiated settlement possessed the largest odds ratios
of 2.65 and 2.41, respectively. The proportion of total casualties necessary for the
likelihood of these outcomes was only slightly less than that for the 20th Century model,
at about 25%. The pattern identified here was that nations involved in modern interstate
wars could accept larger shares of the total casualties and still emerge victorious through
negotiations than could those nations from 19th Century interstate wars.


Opportunities  for  Future  Research 

Advanced statistical techniques may be applied to the diagnostic results from this
research. Specifically, the extra-state and intrastate wars identified as influential to
model estimation could be tagged for more in-depth studies. Case-study approaches for
these wars may help address the question of why these wars proved so influential in this
research. This may be especially important when studying wars that have historically
received scant attention.


The Italo-Libyan War of 1920, the Indonesian War of 1945, the Western Saharan
War of 1975, and the Franco-Tonkin War of 1873 were identified in this research as
influential to estimating the extra-systemic wars models. These wars were
geographically focused in Africa and Southeast Asia, which may prove significant in
discriminant studies on extra-state wars. Rather than emphasizing the importance of
geography, one might discriminate between the combatants in these wars. The combat
records of these belligerents may be of interest. Perhaps a multiple discriminant analysis
(MDA) could be performed on both combatants and geography of these wars.

The Cambodia-Khmer Rouge War of 1970, the Pinochet Rebellion in Chile in
1973, the Somali Secession from Ethiopia in 1976, the Communist Rebellion in El
Salvador in 1979, the Renamo Rebellion in Mozambique in 1979, and the
Russo-Circassian War of 1829 were influential to estimating the intrastate wars model.
Case studies on these wars may provide additional insights into the reasons for their influences
in this research. Opportunities for discriminant analyses also exist for these wars. One
might investigate the factors that separate civil wars from secession wars.

With  the  United  States  engaged  in  the  Global  War  on  Terror  (GWOT),  which  can 
be  considered  an  extra-systemic  war  or  series  of  extra-systemic  wars,  future  studies  on 
conventional  interstate  wars  might  not  prove  as  significant  to  contemporary  military 
operations  as  would  studies  on  intrastate  and  extra-state  wars.  However,  additional 
applications  of  logistic  regression  techniques  exist  for  interstate  wars.  Additional 
relevant  variables  would  need  to  be  identified  in  order  to  expand  upon  the  univariate 
main  effects  models  presented  in  this  thesis.  Instead  of  a  single  multinomial  logistic 
regression  model,  one  might  pair  the  possible  outcomes  of  interstate  wars  and  construct 
binary logistic regression models for each pair. Using this approach, one might
accurately identify influential interstate wars that warrant further statistical studies.
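
The pairwise approach suggested above can be sketched as follows. This is only an illustration of how the outcome pairs and their data subsets might be enumerated, not a fitted model: the record layout (dictionaries with an "outcome" key) and the function names are hypothetical, and the binary fitting step itself is left to whatever statistical package is used.

```python
from itertools import combinations

# Outcome categories of the multinomial interstate wars models.
OUTCOMES = {
    1: "Victory by Military Imposition",
    2: "Capitulation",
    3: "Stalemate",
    4: "Victory by Negotiated Settlement",
    5: "Defeat by Negotiated Settlement",
}


def outcome_pairs(codes):
    """Return every unordered pair of outcome codes."""
    return list(combinations(sorted(codes), 2))


def subset_for_pair(records, pair):
    """Keep records whose outcome is in `pair`; recode the response as 0/1."""
    first, second = pair
    return [
        {**rec, "y": int(rec["outcome"] == second)}
        for rec in records
        if rec["outcome"] in (first, second)
    ]


pairs = outcome_pairs(OUTCOMES)
print(len(pairs), "binary models would be needed")
for first, second in pairs:
    print(f"{OUTCOMES[first]} vs. {OUTCOMES[second]}")
```

With five outcomes there are ten pairs, so each binary model would be estimated on a smaller subset of the 225 observations than the single multinomial model uses.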


Table 9: Variables and Definitions for COWP Interstate Wars Set

Variable    Definition
WarNo       War number
StateNum    COW country code of participant
StateAbb    Abbreviated name of participant
YrBeg1      First beginning year of participant's involvement
MonBeg1     First beginning month of participant's involvement
DayBeg1     First beginning day of participant's involvement
YrEnd1      First ending year of participant's involvement
MonEnd1     First ending month of participant's involvement
DayEnd1     First ending day of participant's involvement
YrBeg2      Second beginning year of participant's involvement (-999 = NA)
MonBeg2     Second beginning month of participant's involvement (-999 = NA)
DayBeg2     Second beginning day of participant's involvement (-999 = NA)
YrEnd2      Second ending year of participant's involvement (-999 = NA)
MonEnd2     Second ending month of participant's involvement (-999 = NA)
DayEnd2     Second ending day of participant's involvement (-999 = NA)
Duration    Length of war participation in days
Deaths      Number of battle related deaths sustained by participant's armed
            forces in war (-999 = missing)
Outcome     War outcome for participant (1 = on winning side, 2 = on losing side,
            3 = on side A of a tie, 4 = on side B of a tie, 5 = on side A of an
            ongoing war, 6 = on side B of an ongoing war)
Initiate    Did state initiate war? (0 = no, 1 = yes)
SysStat     System membership status of state (1 = neither central sub-system
            member nor major power, 2 = central sub-system member only
            [only relevant 1816 through 1919], 3 = central sub-system member
            & a major power [only relevant 1816 through 1919], 4 = major power only)
PrWarPop    Pre-war population in thousands (number from year war begun, -999 = missing)
PrWarArm    Pre-war armed forces in thousands (number from year war begun, -999 = missing)
WestHem     Did state participant engage in fighting in war in Western Hemisphere?
            (0 = no, 1 = yes)
Europe      Did state participant engage in fighting in war in Europe? (0 = no, 1 = yes)
Africa      Did state participant engage in fighting in war in Africa? (0 = no, 1 = yes)
MidEast     Did state participant engage in fighting in war in Middle East? (0 = no, 1 = yes)
Asia        Did state participant engage in fighting in war in Asia? (0 = no, 1 = yes)
Oceania     Did state participant engage in fighting in war in Oceania? (0 = no, 1 = yes)
Version     Version number of data set
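
The -999 missing-value code matters when a proportion such as Deaths/TotDeaths is derived from these variables. The sketch below is a hedged illustration only: it assumes that total deaths for a war is the sum of Deaths over all participants sharing a WarNo, which this section does not confirm, and it uses made-up rows, not COWP records. The function name `death_shares` is hypothetical.

```python
from collections import defaultdict

MISSING = -999  # missing-value code listed in Table 9

# Hypothetical rows for illustration only; these are not COWP records.
rows = [
    {"WarNo": 1, "StateAbb": "AAA", "Deaths": 3000},
    {"WarNo": 1, "StateAbb": "BBB", "Deaths": 1000},
    {"WarNo": 2, "StateAbb": "CCC", "Deaths": 500},
    {"WarNo": 2, "StateAbb": "DDD", "Deaths": MISSING},
]


def death_shares(rows):
    """Map (WarNo, StateAbb) to the participant's share of total deaths.

    The share is None for every participant in a war where any death
    count is missing, because the war total is then unknown.
    """
    totals = defaultdict(int)
    incomplete = set()
    for row in rows:
        if row["Deaths"] == MISSING:
            incomplete.add(row["WarNo"])
        else:
            totals[row["WarNo"]] += row["Deaths"]

    shares = {}
    for row in rows:
        war = row["WarNo"]
        key = (war, row["StateAbb"])
        if war in incomplete or totals[war] == 0:
            shares[key] = None
        else:
            shares[key] = row["Deaths"] / totals[war]
    return shares


for key, share in death_shares(rows).items():
    print(key, share)
```

Treating -999 as an ordinary number would produce negative totals and meaningless shares, so the code is filtered out before any arithmetic is done.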


Table 10: Variables and Definitions for COWP Extra-Systemic Wars Set

Variable    Definition
WarNo       War number
StateNum    COW country code of participant
StateAbb    Abbreviated name of participant
YrBeg1      First beginning year of participant's involvement
MonBeg1     First beginning month of participant's involvement (-999 = missing)
DayBeg1     First beginning day of participant's involvement (-999 = missing)
YrEnd1      First ending year of participant's involvement
MonEnd1     First ending month of participant's involvement (-999 = missing)
DayEnd1     First ending day of participant's involvement (-999 = missing)
YrBeg2      Second beginning year of participant's involvement (-999 = NA)
MonBeg2     Second beginning month of participant's involvement (-999 = NA or missing)
DayBeg2     Second beginning day of participant's involvement (-999 = NA or missing)
YrEnd2      Second ending year of participant's involvement (-999 = NA)
MonEnd2     Second ending month of participant's involvement (-999 = NA or missing)
DayEnd2     Second ending day of participant's involvement (-999 = NA or missing)
MinDur      Minimum length of war participation in days*
MaxDur      Maximum length of war participation in days*
Deaths      Number of battle related deaths sustained by participant's armed forces in war
            (-999 = missing)
IntSide     On which side did participant intervene? (0 = NA/state is primary actor in war,
            1 = on side of state, 2 = on side of colony/non-state, 3 = on neither side)
Initiate    Did state initiate war? (0 = no, 1 = yes)
SysStat     System membership status of state (1 = neither central sub-system member nor major
            power, 2 = central sub-system member only [only relevant 1816 through 1919],
            3 = central sub-system member & a major power [only relevant 1816 through 1919],
            4 = major power only)
PrWarPop    Pre-war population in thousands (number from year war begun, -999 = missing)
PrWarArm    Pre-war armed forces in thousands (number from year war begun, -999 = missing)
WestHem     Did state participant engage in fighting in war in Western Hemisphere?
            (0 = no, 1 = yes)
Europe      Did state participant engage in fighting in war in Europe? (0 = no, 1 = yes)
Africa      Did state participant engage in fighting in war in Africa? (0 = no, 1 = yes)
MidEast     Did state participant engage in fighting in war in Middle East? (0 = no, 1 = yes)
Asia        Did state participant engage in fighting in war in Asia? (0 = no, 1 = yes)
Oceania     Did state participant engage in fighting in war in Oceania? (0 = no, 1 = yes)
Version     Version number of data set


Table 11: Variables and Definitions for COWP Intrastate Wars Set

Variable    Definition
WarNo       War number
StateNum    COW country code of participant
StateAbb    Abbreviated name of participant
YrBeg1      First beginning year of participant's involvement
MonBeg1     First beginning month of participant's involvement (-999 = missing)
DayBeg1     First beginning day of participant's involvement (-999 = missing)
YrEnd1      First ending year of participant's involvement
MonEnd1     First ending month of participant's involvement (-999 = missing)
DayEnd1     First ending day of participant's involvement (-999 = missing)
YrBeg2      Second beginning year of participant's involvement (-999 = NA)
MonBeg2     Second beginning month of participant's involvement (-999 = NA or missing)
DayBeg2     Second beginning day of participant's involvement (-999 = NA or missing)
YrEnd2      Second ending year of participant's involvement (-999 = NA)
MonEnd2     Second ending month of participant's involvement (-999 = NA or missing)
DayEnd2     Second ending day of participant's involvement (-999 = NA or missing)
MinDur      Minimum length of war participation in days*
MaxDur      Maximum length of war participation in days*
Deaths      Number of battle related deaths sustained by participant's armed forces in war
            (-999 = missing)
IntSide     On which side did participant intervene? (0 = NA/state is undergoing intra-state
            war, 1 = on side of state, 2 = on side of opposition, 3 = on neither side)
SysStat     System membership status of state (1 = neither central sub-system member nor major
            power, 2 = central sub-system member only [only relevant 1816 through 1919],
            3 = central sub-system member & a major power [only relevant 1816 through 1919],
            4 = major power only)
PrWarPop    Pre-war population in thousands (number from year war begun, -999 = missing)
PrWarArm    Pre-war armed forces in thousands (number from year war begun, -999 = missing)
Version     Version number of data set


Table 12: Variables and Definitions for COWP MID Data Set

Variable   Variable    Variable
Number     Name        Description
1          DispNum     Dispute Number
2          StDay       Start day of dispute (-9 = missing)
3          StMon       Start month of dispute (-9 = missing)
4          StYear      Start year of dispute (-9 = missing)
5          EndDay      End day of dispute (-9 = missing)
6          EndMon      End month of dispute (-9 = missing)
7          EndYear     End year of dispute (-9 = missing)
8          Outcome     Outcome of dispute:
                          1   Victory for side A
                          2   Victory for side B
                          3   Yield by side A
                          4   Yield by side B
                          5   Stalemate
                          6   Compromise
                          7   Released
                          8   Unclear
                          9   Joins ongoing war
                         -9   Missing
9          Settle      Settlement of dispute:
                          1   Negotiated
                          2   Imposed
                          3   None
                          4   Unclear
                         -9   Missing
10         Fatality    Fatality level of dispute:
                          0   None
                          1   <26 deaths
                          2   26-100 deaths
                          3   101-250 deaths
                          4   251-500 deaths
                          5   501-999 deaths
                          6   >999 deaths
                         -9   Missing
11         FatalPre    Precise Fatalities, if known (-9 = missing)
12         MaxDur      Maximum duration of dispute
13         MinDur      Minimum duration of dispute


Table 13: Variables and Definitions for MID Set (cont.)

Variable   Variable    Variable
Number     Name        Description
14         HiAct       Highest action in dispute [bracketed numbers refer to
                       corresponding hostility level]:
                          0   No militarized action [1]
                          1   Threat to use force [2]
                          2   Threat to blockade [2]
                          3   Threat to occupy territory [2]
                          4   Threat to declare war [2]
                          5   Threat to use CBR weapons [2]
                          6   Threat to join war
                          7   Show of force [3]
                          8   Alert [3]
                          9   Nuclear alert [3]
                         10   Mobilization [3]
                         11   Fortify border [3]
                         12   Border violation [3]
                         13   Blockade [4]
                         14   Occupation of territory [4]
                         15   Seizure [4]
                         16   Attack [4]
                         17   Clash [4]
                         18   Declaration of war [4]
                         19   Use of CBR weapons [4]
                         20   Begin interstate war [5]
                         21   Join interstate war [5]
                         -9   Missing [-9]
15         HostLev     Hostility level of dispute:
                          1   No militarized action
                          2   Threat to use force
                          3   Display of force
                          4   Use of force
                          5   War


Table 14: Variables and Definitions for MID Set (cont.)

Variable   Variable    Variable
Number     Name        Description
16         Recip       Reciprocated dispute? (1 = yes, 0 = no)
17         NumA        Number of states on side A
18         NumB        Number of states on side B
19         Link1       Links to other disputes/wars #1 (contains dispute
                       number [variable "DispNum"] of other dispute;
                       links to war indicated by code "W", e.g. "167W"
                       is link to war number 167)
20         Link2       Links to other disputes/wars #2
21         Link3       Links to other disputes/wars #3
22         Ongo2001    Ongoing after 2001? (0 = concluded before 12/31/2001,
                       1 = continuing as of 12/31/2001)
23         Version     Version number of data set


Table 15: Variables and Definitions for National Materiel Capabilities

Variable    Definition
StateAbb    3 letter country abbreviation
Ccode       COW Country code
Year        Year of Observation
IrSt        Iron and steel production (thousands of tons)
MilEx       Military expenditures (thousands of 2001 US dollars)
MilPer      Military Personnel (thousands)
Energy      Energy consumption (thousands of coal-ton equivalents)
Tpop        Total Population (thousands)
Upop        Urban Population (population living in cities with population greater than 100,000)
CINC        Composite Index of National Capability (CINC) score
Version     Version number of the data set